THE ANNALS 
of 
MATHEMATICAL
STATISTICS 


(FOUNDED BY H. C. CARVER) 


THE OFFICIAL JOURNAL OF THE INSTITUTE 
OF MATHEMATICAL STATISTICS 


Contents 


On the Asymptotic Distribution of Differentiable Statistical Functions. R. v. MISES


Approximate Solutions for Means and Variances in a Certain Class
of Box Problems. PHILIP J. McCARTHY

The Distribution of the Range. E. J. GUMBEL

Low Moments for Small Samples: A Comparative Study of Order
Statistics. CECIL HASTINGS, JR., FREDERICK MOSTELLER,
JOHN W. TUKEY, AND CHARLES P. WINSOR

Sequential Confidence Intervals for the Mean of a Normal Distri- 
bution with Known Variance. CHARLES STEIN AND ABRAHAM 


A Useful Convergence Theorem for Probability Distributions. HENRY SCHEFFÉ


An Explicit Representation of a Stationary Gaussian Process. M. KAC
AND A. J. F. SIEGERT


Approximate Formulas for the Radii of Circles which Include a Specified
Fraction of a Normal Bivariate Distribution. E. N. OBERG


A Note on the Efficiency of the Wald Sequential Test. EDWARD PAULSON
A Note on the Poisson-Charlier Functions. C. TRUESDELL 


Abstracts of Papers 

News and Notices 

Report on the New York City Meeting of the Institute
Report on the April Meeting of the Institute in Atlantic City 
Report on the San Diego Meeting of the Institute 


Vol. XVIII, No. 3 — September, 1947 





THE ANNALS
OF MATHEMATICAL STATISTICS


EDITED BY


S. S. WILKS, Editor
M. S. BARTLETT    HARALD CRAMÉR    J. NEYMAN
WILLIAM G. COCHRAN    W. EDWARDS DEMING    WALTER A. SHEWHART
ALLEN T. CRAIG    J. L. DOOB    JOHN W. TUKEY
C. C. CRAIG    W. FELLER    A. WALD
HAROLD HOTELLING


WITH THE COOPERATION OF 


CHURCHILL EISENHART    WILLIAM G. MADOW
M. A. GIRSHICK    ALEXANDER M. MOOD
PAUL R. HALMOS    FREDERICK MOSTELLER
PAUL G. HOEL    HENRY SCHEFFÉ
MARK KAC    JACOB WOLFOWITZ


The ANNALS OF MATHEMATICAL STATISTICS is published quarterly by the
Institute of Mathematical Statistics, Mt. Royal & Guilford Aves., Baltimore 2,
Md. Subscriptions, renewals, orders for back numbers and other business
communications should be sent to the ANNALS OF MATHEMATICAL STATISTICS, Mt.
Royal & Guilford Aves., Baltimore 2, Md., or to the Secretary of the Institute
of Mathematical Statistics, P. S. Dwyer, 116 Rackham Hall, University of
Michigan, Ann Arbor, Mich.

Changes in mailing address which are to become effective for a given
issue should be reported to the Secretary on or before the 15th of the
month preceding the month of that issue. The months of issue are March,
June, September and December.

Manuscripts for publication in the ANNALS OF MATHEMATICAL STATISTICS
should be sent to S. S. Wilks, Fine Hall, Princeton, New Jersey. Manuscripts
should be typewritten double-spaced with wide margins, and the original copy
should be submitted. Footnotes should be reduced to a minimum and whenever
possible replaced by a bibliography at the end of the paper; formulae in
footnotes should be avoided. Figures, charts, and diagrams should be drawn on
plain white paper or tracing cloth in black India ink twice the size they are to
be printed. Authors are requested to keep in mind typographical difficulties
of complicated mathematical formulae.


Authors will ordinarily receive only galley proofs. Fifty reprints without
covers will be furnished free. Additional reprints and covers furnished at cost.


The subscription price for the ANNALS is $5.00 per year. Single copies $1.50.
Back numbers are available at $5.00 per volume, or $1.50 per single issue.


COMPOSED AND PRINTED AT THE
WAVERLY PRESS, INC.
BALTIMORE, MD., U. S. A.


Entered as second-class matter at the Post Office at Baltimore, Maryland, under the Act of March 3, 1879 








ON THE ASYMPTOTIC DISTRIBUTION OF DIFFERENTIABLE 
STATISTICAL FUNCTIONS 


By R. v. Mises 


Harvard University 


TABLE OF CONTENTS 


Introduction 
Part I. Preliminary Theorems. 
1. Asymptotically Equal Distributions 
2. Special Class of Statistical Functions: Quantics 
3. Asymptotic Expectation of Excess-Power Products
4. Asymptotic Expectation and Variance of Quantics
5. Final Statement on the Limit of Expectation of Quantics
6. Theorem on Products of n Functions 
Part II. Differentiable Statistical Functions. 
1. Definitions 
2. Taylor Development 
3. General Theorem 
4. Illustrations 
Part III. Second-Type Asymptotic Distribution. 
1. Statement of the Problem 
2. [title illegible]
3. [title illegible]
4. [title illegible]
5. Transition to the Continuous Case 
References 


Introduction. If real variables $x_1, x_2, \cdots, x_n$ are subject to a probability
distribution with the element $dV_1(x_1)\,dV_2(x_2) \cdots dV_n(x_n)$ one can ask for the
distribution of any function $f$ of $x_1, x_2, \cdots, x_n$. We are primarily interested in
statistical functions, i.e. in functions that depend on the repartition$^1$ $S_n(x)$ of the
$n$ quantities $x_1, x_2, \cdots, x_n$ only. The simplest case is that of the linear
statistical functions

$$ f = \int \psi(x)\, dS_n(x) = \frac{1}{n}\,[\psi(x_1) + \psi(x_2) + \cdots + \psi(x_n)]. \tag{1} $$
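The definition of the repartition and of the linear statistical function (1) can be illustrated by a minimal Python sketch. The function names `repartition` and `linear_statistical_function` and the sample values are assumptions made for this illustration only; they are not notation taken from the paper.

```python
from bisect import bisect_right


def repartition(sample):
    """Return S_n as a function: S_n(x) = (number of sample values <= x) / n."""
    ordered = sorted(sample)
    n = len(ordered)

    def S_n(x):
        # bisect_right counts the ordered values that are <= x
        return bisect_right(ordered, x) / n

    return S_n


def linear_statistical_function(psi, sample):
    """Integral of psi with respect to dS_n, i.e. the sample mean of psi(x_i)."""
    return sum(psi(x) for x in sample) / len(sample)


# Hypothetical sample of n = 4 observations
sample = [2.0, -1.0, 0.5, 2.0]
S = repartition(sample)
mean_value = linear_statistical_function(lambda x: x, sample)
```

With $\psi(x) = x$ the linear statistical function is the ordinary sample mean; any other choice of $\psi$ is handled by the same call.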


The so-called Central Limit Theorem of Probability Calculus states that the
distribution of a linear statistical function, if $n$ tends to infinity, approaches
more and more the normal (Gauss) distribution if some very general conditions
linking $\psi(x)$ and the $V_\nu(x)$ are fulfilled. It was shown ten years ago [2]
that the restriction to linear functions here is immaterial. Much more general


$^1$ The function $S_n(x)$ is called the repartition of the real quantities $x_1, x_2, \cdots, x_n$ if
$nS_n(x)$ is the number of those among the $x_1, x_2, \cdots, x_n$ that are smaller than or equal to $x$.




statistical functions tend towards normalcy with increasing $n$, for example the
variance of $m$th order

$$ f = M_m = \int (x - a)^m\, dS_n(x), \qquad a = \int x\, dS_n(x) \tag{2} $$

and, likewise, such combinations as the Lexis quotient $M_2/a(1 - a/N)$ or Gini's
disparity measure $1 - \int (1 - S_n)^2\, dx / a$ or, in the multidimensional case, the
correlation coefficient, etc. On the other hand, statistical functions are known
whose distributions assume, asymptotically, a form different from the Gaussian.
One example is Pearson's Chi-square, another the test function $\omega^2$, introduced
by H. Cramér [1] and the author [4]:

$$ \omega^2 = \int g'(x)\,[S_n(x) - \bar V_n(x)]^2\, dx \tag{3} $$

where $g'(x) > 0$ and

$$ \bar V_n(x) = \frac{1}{n}\,[V_1(x) + V_2(x) + \cdots + V_n(x)]. \tag{4} $$

N. V. Smirnoff [7, 8] computed the asymptotic distribution of $\omega^2$ for the case
that all $V_\nu(x)$ and, therefore, $\bar V_n(x)$ equal one and the same distribution
function $V(x)$. The result differs widely from the Gaussian distribution.
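To make the test function (3) concrete, the following sketch evaluates $\omega^2$ in the simplest special case. It assumes, purely for illustration, that all $V_\nu$ equal the uniform distribution on $(0, 1)$ and that the weight is $g'(x) = 1$. The closed form used in `omega_squared_uniform` is the standard computing formula for this case; `omega_squared_numeric` is a crude midpoint-rule check of the integral. Both function names are hypothetical.

```python
def omega_squared_uniform(sample):
    """omega^2 = integral of [S_n(x) - x]^2 dx over (0, 1), closed form.

    Uses n * omega^2 = 1/(12 n) + sum_i (u_(i) - (2i - 1)/(2n))^2
    with u_(1) <= ... <= u_(n) the ordered sample (1-based i).
    """
    u = sorted(sample)
    n = len(u)
    total = 1.0 / (12 * n)
    for i, value in enumerate(u):          # i is 0-based here
        total += (value - (2 * i + 1) / (2 * n)) ** 2
    return total / n


def omega_squared_numeric(sample, steps=20000):
    """Midpoint-rule approximation of the same integral (illustrative check)."""
    n = len(sample)
    h = 1.0 / steps
    total = 0.0
    for j in range(steps):
        x = (j + 0.5) * h
        s_n = sum(1 for v in sample if v <= x) / n   # repartition S_n(x)
        total += (s_n - x) ** 2 * h
    return total


# Hypothetical sample on (0, 1)
sample = [0.1, 0.4, 0.45, 0.9]
closed = omega_squared_uniform(sample)
approx = omega_squared_numeric(sample)
```

Other weights $g'(x)$ or other distributions $V(x)$ would require a different computing formula; only the integral definition (3) is taken from the text.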

In order to understand all this it is necessary to consider $f$ as a function
defined in the space of distributions $V(x)$ (or in a sub-space of it). Then, the
variable $f$ whose distribution is sought is the value of $f\{V(x)\}$ at the “point” $S_n(x)$
and should be written as $f\{S_n(x)\}$. Such “functions of functions” were first
introduced by Vito Volterra (1887) and are today a familiar topic of higher
analysis. The first statement that can be made is that the asymptotic
distribution of $f\{S_n(x)\}$ depends mainly on the behavior of $f\{V(x)\}$ at the point
$\bar V_n(x)$ defined by (4).

Volterra also introduced the notion of derivatives and of Taylor development
for a “fonction de ligne.” Using these concepts a more specific statement can
be pronounced: The type of asymptotic distribution of a differentiable statistical
function $f\{S_n(x)\}$ depends on which is the first non-vanishing term in the Taylor
development of $f\{V(x)\}$ at the point $\bar V_n(x)$; if it is the linear term the limiting
distribution is normal, under restrictions that can easily be derived from the Central
Limit Theorem; in other cases higher types of asymptotic distributions result.

The present paper tries to establish this theorem and to furnish preliminary 
information about the asymptotic distribution of the second type. 

If both the function $f\{V(x)\}$ and the sequence of distributions $V_1(x), V_2(x),
V_3(x), \cdots$ are defined independently of each other, it cannot be presumed that
the derivative of $f$ vanishes at $\bar V_n(x)$. In this sense the normal distribution
appears as the “general case” of an asymptotic distribution while the higher types
represent certain “singularities.” In the case of type $m$, $(m = 1, 2, 3, \cdots)$,
the distribution of the expression

$$ n^{m/2}\,[f\{S_n(x)\} - f\{\bar V_n(x)\}] \tag{5} $$


tends towards a function of bounded mean value and variance. For $m = 1$
it is a Gauss function with mean value 0 and finite variance. For any uneven
$m$ the distribution is symmetrical with respect to the zero point. If $f$ is given,
the limiting distribution is essentially determined if in addition to $\bar V_n(x)$ one
function of two variables, $\bar U_n(x, y)$, is known,

$$ \bar U_n(x, y) = \begin{cases} \dfrac{1}{n} \sum_{\nu=1}^{n} [V_\nu(x) - V_\nu(x) V_\nu(y)], & (x \le y) \\[2ex] \dfrac{1}{n} \sum_{\nu=1}^{n} [V_\nu(y) - V_\nu(x) V_\nu(y)], & (x \ge y). \end{cases} \tag{6} $$


For instance, in the case of the linear function ($m = 1$) defined in eq. (1), the
(second order) variance of (5) is found as the Stieltjes integral

$$ \iint \psi(x)\,\psi(y)\, d\bar U_n(x, y) \tag{7} $$

and no mean values of higher order are required for computing the moments of
any order, whatever $m$ is.

For $m = 2$ the complete expression for the characteristic function of the
asymptotic distribution of (5) is developed in Part III of this paper. It has the form

$$ \frac{1}{\sqrt{D(ui)}} \tag{8} $$

where $D(\lambda)$ is in general the Fredholm determinant of a symmetrical kernel that
depends on the second derivative of $f\{V(x)\}$ at $V = \bar V_n$, on $\bar V_n$ and on $\bar U_n$.
If the $V_\nu(x)$ are discontinuous distributions with saltus at $k$ distinct points only,
$D$ is the determinant of a quadratic form of $k$ variables. This happens to be
the case with Pearson's $\chi^2$ while the $\omega^2$ distribution found by Smirnoff represents
a fairly general case of the asymptotic distribution of second type.


PART I. PRELIMINARY THEOREMS 


1. Asymptotically equal distributions. Let $K_1, K_2, K_3, \cdots$ be an infinite
sequence of collectives, $k_n$ the number of variables in $K_n$ and $A_n$, $B_n$ two
functions of these variables, $(n = 1, 2, 3, \cdots)$. The cumulative distribution
functions of $A_n$ and $B_n$ will be denoted by $P_n(x)$ and $Q_n(x)$ respectively, i.e.

$$ P_n(x) = \text{Prob}\,\{A_n \le x\}, \qquad Q_n(x) = \text{Prob}\,\{B_n \le x\} \tag{1} $$

and the expectation of $|A_n - B_n|$ by

$$ E_n\{\,|A_n - B_n|\,\} \tag{2} $$

all these quantities being taken with respect to the distribution in $K_n$.




Two functions $F_n(x)$ and $G_n(x)$ both depending on the parameter $n$ are said
to be asymptotically equal if

$$ \lim_{n \to \infty} |F_n(x) - G_n(x)| = 0 \quad \text{uniformly in } x. \tag{3} $$

If this is the case for the cumulative distribution functions $P_n(x)$ and $Q_n(x)$ of
$A_n$ and $B_n$ we shall also say that $A_n$ and $B_n$ have the same asymptotic
distribution. Eq. (3) will also be written as $F_n(x) \sim G_n(x)$. The following can be
proved:

Lemma A. If with increasing $n$ the expectation of the absolute difference
between $A_n$ and $B_n$ tends towards zero and if one of the functions $P_n(x)$ or $Q_n(x)$ is
asymptotically equal to a function $F_n(x)$ that has a uniformly bounded derivative,
i.e.

$$ \lim_{n \to \infty} E_n\{\,|A_n - B_n|\,\} = 0, \qquad \left|\frac{dF_n}{dx}\right| < M \ \text{ for all } n \tag{4} $$

then $A_n$ and $B_n$ have the same asymptotic distribution.

This statement, in a slightly different wording, was proved in an earlier paper
[2] and the proof will not be repeated here. If one of the various definitions for
“stochastical convergence” is used, one can also say that $A_n$ and $B_n$, under the
stated conditions, converge stochastically towards each other.

The Lemma A can be extended and modified in various ways. First, it is
obvious that the expectation of $|A_n - B_n|$ can be replaced by that of any
positive power of $|A_n - B_n|$. With respect to $F_n$ one could ask for the existence
of a bounded derivative in all points except for a zero set only. Then $P_n$ and
$Q_n$ would still converge everywhere except for this zero set and the definition
of asymptotically equal distributions could be extended to this case. In the
present paper this will not be done as it is not our purpose to strive for results
of the greatest possible generality.


2. Special class of statistical functions: quantics. Preliminary to the study
of general statistical functions a special class which corresponds to quantics
(homogeneous polynomials) of $m$th order must be discussed. Let $V_1(x), V_2(x),
V_3(x), \cdots$ be the cumulative distribution functions in a sequence of
one-dimensional collectives $C_1, C_2, C_3, \cdots$ and $S_n(x)$ the repartition of a sample drawn
from the $n$-dimensional collective $K_n$, with the distribution element

$$ dV_1(x_1)\,dV_2(x_2) \cdots dV_n(x_n). $$

We introduce

$$ T_n(x) = S_n(x) - \bar V_n(x), \qquad \bar V_n(x) = \frac{1}{n} \sum_{\nu=1}^{n} V_\nu(x). \tag{5} $$















Here, $nT_n(x)$ is obviously the excess of observed values $\le x$ over their expected
number. Quantics of first, second, third, $\cdots$ order are then defined as

$$ \begin{aligned} f_1\{S_n(x)\} &= \int \psi(x)\, dT_n(x) \\ f_2\{S_n(x)\} &= \iint \psi(x, y)\, dT_n(x)\, dT_n(y) \\ f_3\{S_n(x)\} &= \iiint \psi(x, y, z)\, dT_n(x)\, dT_n(y)\, dT_n(z) \end{aligned} \tag{6} $$

all integrals to be extended over the total range of $x$. Of course, only such $\psi$
for which the respective integral exists are admitted. The first, $f_1$, is obviously
a linear statistical function and the asymptotic distribution of $\sqrt{n}\, f_1$ is, under
well-known conditions, a Gauss function with the mean value zero and the
variance given in eq. (7) of the Introduction. In $f_2, f_3, \cdots$ the $\psi$ may be
supposed to be symmetrical with respect to their variables. It will be seen
later (Part II, sec. 2) that the first derivative of $f_2$, the first and second
derivatives of $f_3$, etc. vanish at the point $\bar V_n(x)$.

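All the above functions $f_1, f_2, f_3, \cdots$ can be considered (if the $\psi$ are
continuous) as the limits of ordinary quantics in $k$ variables. Choose $k$ disjoint
intervals $I_1, I_2, \cdots, I_k$ on the $x$-axis, and call $I_{k+1}$ their complement. Denote the
increment of $V_\nu(x)$ within $I_\kappa$ by $p_{\nu\kappa}$, and the increment of $S_n(x)$ by $p_{n\kappa}$.
Obviously $p_{\nu\kappa}$ is the probability, within $C_\nu$, of $x$ falling in the interval $I_\kappa$ and $np_{n\kappa}$
is the number of observed sample values in the same interval. We introduce
the excess values $\xi_\kappa$:

$$ \xi_\kappa = p_{n\kappa} - \bar p_{n\kappa}, \qquad \bar p_{n\kappa} = \frac{1}{n} \sum_{\nu=1}^{n} p_{\nu\kappa} \tag{7} $$

and form the sums

$$ f_1 = \sum_{\kappa=1}^{k} \psi_\kappa\, \xi_\kappa, \qquad f_2 = \sum_{\iota,\kappa}^{1 \cdots k} \psi_{\iota\kappa}\, \xi_\iota \xi_\kappa, \qquad f_3 = \sum_{\iota,\kappa,\lambda}^{1 \cdots k} \psi_{\iota\kappa\lambda}\, \xi_\iota \xi_\kappa \xi_\lambda, \quad \cdots \tag{8} $$

By selecting suitable sets of intervals $I_1, I_2, \cdots, I_k$ and appropriate values
for the constants $\psi_\kappa, \psi_{\iota\kappa}, \cdots$, one can approximate the integrals (6) by sums
of the form (8).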
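The passage from a sample to the excess values (7) and the sums (8) can be sketched as follows. The half-open interval convention `(lo, hi]`, the function names, and the numerical values are assumptions chosen for this illustration; the paper itself only requires disjoint intervals.

```python
def excess_values(sample, intervals, mean_probs):
    """xi_k = observed relative frequency in I_k minus the mean probability of I_k."""
    n = len(sample)
    xi = []
    for (lo, hi), p_bar in zip(intervals, mean_probs):
        observed = sum(1 for x in sample if lo < x <= hi) / n
        xi.append(observed - p_bar)
    return xi


def quantic_1(psi, xi):
    """f_1 = sum_k psi[k] * xi[k]."""
    return sum(psi[k] * xi[k] for k in range(len(xi)))


def quantic_2(psi, xi):
    """f_2 = sum_{i,k} psi[i][k] * xi[i] * xi[k]."""
    k = len(xi)
    return sum(psi[i][j] * xi[i] * xi[j] for i in range(k) for j in range(k))


# Hypothetical example: uniform distribution on (0, 1], two intervals of probability 1/2
sample = [0.1, 0.2, 0.3, 0.9]
intervals = [(0.0, 0.5), (0.5, 1.0)]
xi = excess_values(sample, intervals, [0.5, 0.5])
f1 = quantic_1([1.0, 3.0], xi)
f2 = quantic_2([[1.0, 0.0], [0.0, 1.0]], xi)
```

Here three of the four observations fall in the first interval, so the excess values are $+1/4$ and $-1/4$.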

Our next task will be to find asymptotic values for the expectation and for the
moments of the quantities defined in (8). Clearly a formula for the expectation
of a power product $\xi_1^\alpha \xi_2^\beta \xi_3^\gamma \cdots$ where $\alpha, \beta, \gamma, \cdots$ are positive integers, is the
only thing we need. To arrive at such a formula we replace each of the
one-dimensional collectives $C_\nu$ by a $k$-dimensional $C_\nu^{(k)}$ in the following way.

In $C_\nu^{(k)}$ the chance variable is a $k$-dimensional vector which can take $(k + 1)$
distinct values only: it can be zero or coincide with the unit vector parallel to
one of the $k$ axes. To the latter values of the variable we assign the probabilities
$p_{\nu 1}, p_{\nu 2}, \cdots, p_{\nu k}$ and to the zero the probability

$$ p_{\nu, k+1} = 1 - p_{\nu 1} - p_{\nu 2} - \cdots - p_{\nu k}. \tag{9} $$

This quantity, of course, may vanish. The mean value of $C_\nu^{(k)}$ is the point with
the coordinates $p_{\nu 1}, p_{\nu 2}, \cdots, p_{\nu k}$.

If the $n$ collectives $C_1^{(k)}, C_2^{(k)}, \cdots, C_n^{(k)}$ are combined, the sum of the $n$ observed
vector values is a vector with the components $np_{n1}, np_{n2}, \cdots, np_{nk}$. If in
each $C_\nu^{(k)}$ the origin is shifted to the mean value and the coordinates with respect
to the new origin are called $z_1, z_2, \cdots, z_k$, the sums of the observed $z_1, z_2, \cdots,
z_k$-values will be $n\xi_1, n\xi_2, \cdots, n\xi_k$ rather than $np_{n1}, np_{n2}, \cdots, np_{nk}$. Thus
it is seen that all questions concerning the distributions of $\xi_1, \xi_2, \xi_3, \cdots$ can
be answered on the basis of the well-known rules on the addition of $n$ independent
chance variables. This leads to the symbolic formula for the expectation:

$$ E_n\{(n\xi_1)^\alpha (n\xi_2)^\beta (n\xi_3)^\gamma \cdots\} = \Big(\sum_{\nu=1}^{n} Z_{\nu 1}\Big)^{\alpha} \Big(\sum_{\nu=1}^{n} Z_{\nu 2}\Big)^{\beta} \Big(\sum_{\nu=1}^{n} Z_{\nu 3}\Big)^{\gamma} \cdots \tag{10} $$

where on the right-hand side each term

$$ Z_{\nu\iota}\, Z_{\nu\kappa}\, Z_{\nu\lambda} \cdots \tag{11} $$

has to be replaced by

$$ \int z_\iota z_\kappa z_\lambda \cdots\, dV_\nu^{(k)}(z). \tag{11'} $$

Here, obviously, $V_\nu^{(k)}(z)$ is the distribution function in $C_\nu^{(k)}$ and the expressions
(11') are in fact sums of $(k + 1)$ terms, for example

$$ \int z_1 z_2\, dV_\nu^{(k)}(z) = p_{\nu 1}(1 - p_{\nu 1})(-p_{\nu 2}) + p_{\nu 2}(-p_{\nu 1})(1 - p_{\nu 2}) + \sum_{\lambda=3}^{k+1} p_{\nu\lambda}(-p_{\nu 1})(-p_{\nu 2}) = -p_{\nu 1} p_{\nu 2}. \tag{12} $$

It will be seen in the next section that only very few of these sums are needed
for computing the asymptotic value of (10). Note that the value of (11') can
be expressed in terms of $p_{\nu 1}, p_{\nu 2}, p_{\nu 3}, \cdots$ alone if $\xi_1, \xi_2, \xi_3, \cdots$ only appear in
the product.
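The sum of $(k + 1)$ terms in (12) can be checked by direct enumeration. The sketch below is an illustration under stated assumptions: the function name `mixed_moment` and the probabilities are hypothetical, and exact rational arithmetic is used so that the identities hold without rounding.

```python
from fractions import Fraction


def mixed_moment(p, indices):
    """Integral of z_i * z_j * ... over the (k+1)-point collective.

    p: probabilities p_1..p_k of the k unit vectors (the zero vector
       receives the remaining probability 1 - sum(p)).
    indices: 0-based axis indices of the factors z_i; coordinates are
       measured from the mean value (p_1, ..., p_k).
    """
    k = len(p)
    outcomes = [(j, p[j]) for j in range(k)] + [(None, 1 - sum(p))]
    total = Fraction(0)
    for j, prob in outcomes:
        term = prob
        for i in indices:
            # coordinate i of the outcome, shifted by the mean p[i]
            term *= (1 if i == j else 0) - p[i]
        total += term
    return total


# Hypothetical probabilities for k = 3 intervals; the complement has probability 2/5
p = [Fraction(1, 5), Fraction(3, 10), Fraction(1, 10)]
first_moment = mixed_moment(p, [0])          # vanishes: origin is the mean value
cross_moment = mixed_moment(p, [0, 1])       # equals -p_1 * p_2, as in (12)
square_moment = mixed_moment(p, [0, 0])      # equals p_1 * (1 - p_1)
```

The same routine evaluates any of the integrals (11') for a single collective, whatever the number of factors.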


3. Asymptotic expectation of excess-power products. We first consider the
case where the sum of exponents $\alpha, \beta, \gamma, \cdots$ is an even number

$$ \alpha + \beta + \gamma + \cdots = 2m. \tag{13} $$

On the right-hand side of (10) stands a sum of $n^{2m}$ terms, each a product of $2m$
factors $Z_{\nu\kappa}$. It follows from (11') that the absolute value of a product cannot
surpass 1. The second subscripts are the same in each term: first $\alpha$ ones, then
$\beta$ twos, $\gamma$ threes, etc. The first subscripts are in each term a combination of
$2m$ digits out of $\nu = 1, 2, 3, \cdots, n$. The number of those combinations which
include $s$ different $\nu$-values, $(s = 1, 2, \cdots, 2m)$, is

$$ K_s^{(m)} \binom{n}{s} = O(n^s). \tag{14} $$

Obviously, the $K_s^{(m)}$ are bounded (independent of $n$).
If $s > m$ the combination of first subscripts must include at least one $\nu$-value
that appears only once. All those products vanish since

$$ \int z_\kappa\, dV_\nu^{(k)}(z) = 0 \quad \text{for all } \kappa, \nu \tag{15} $$

due to the fact that the origin in the $z$-space coincides with the mean value of
the distribution $V_\nu^{(k)}(z)$. Note that

$$ \binom{n}{m} \sim \frac{n^m}{m!}. \tag{16} $$




It follows that the sum of all terms in (10) that correspond to any $s < m$ is
of the order $o(n^m)$ or smaller.

Thus, we arrive at an asymptotic expression for $E_n$ by dividing both sides of
(10) by $n^m$:

$$ n^m E_n\{\xi_1^\alpha \xi_2^\beta \xi_3^\gamma \cdots\} \sim \frac{1}{n^m} \sum \Big(\prod Z_{\nu\kappa}\Big) \tag{17} $$

where only such products on the right-hand side are retained which include
exactly $m$ different $\nu$-values each appearing twice.
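In analogy to (12) we compute

$$ \int z_\iota z_\kappa\, dV_\nu^{(k)}(z) = \begin{cases} -p_{\nu\iota}\, p_{\nu\kappa} & (\iota \ne \kappa) \\ p_{\nu\iota}(1 - p_{\nu\iota}) & (\iota = \kappa) \end{cases} \tag{18} $$

and write, for the sake of abbreviation

$$ P^{(\nu)}_{\iota\kappa} = p_{\nu\iota}\,\delta_{\iota\kappa} - p_{\nu\iota}\, p_{\nu\kappa} = P^{(\nu)}_{\kappa\iota} \tag{19} $$

with the usual meaning of $\delta_{\iota\kappa}$ ($= 0$ if $\iota \ne \kappa$ and $= 1$ if $\iota = \kappa$). Then the sum
to the right in (17) includes $(2m)!/2^m$ terms, each a product of $m$ factors $P^{(\nu)}_{\iota\kappa}$.
If each of the $m$ couples $\iota, \kappa$ consists of two different figures, the respective
product appears $\alpha!\,\beta!\,\gamma! \cdots$ times; if $r$ couples are doubles ($\iota = \kappa$) the multiplicity
of the term is $2^{-r}\,\alpha!\,\beta!\,\gamma! \cdots$. Therefore, (17) takes the form

$$ n^m E_n\{\xi_1^\alpha \xi_2^\beta \xi_3^\gamma \cdots\} \sim \frac{\alpha!\,\beta!\,\gamma! \cdots}{n^m} \sum 2^{-r}\, P^{(\nu_1)}_{\iota_1\kappa_1} P^{(\nu_2)}_{\iota_2\kappa_2} \cdots P^{(\nu_m)}_{\iota_m\kappa_m}. \tag{20} $$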


































In this sum the upper indices are any set of $m$ digits out of $1, 2, 3, \cdots, n$
and the subscripts are all sets of $m$ couples including $\alpha$ ones, $\beta$ twos, $\gamma$ threes,
etc. To each such set of $m$ couples belong $\binom{n}{m}$ terms of the sum. The number
of sets of couples is bounded (independent of $n$). The exponent $r$ is the number
of doubles ($\iota = \kappa$) among the $m$ pairs.

The expression (20) admits of a transformation which renders it much more
suitable. Assume that a set of couples $\iota, \kappa$ has been chosen according to the
conditions and consider the product

$$ \Big(\sum_{\nu=1}^{n} P^{(\nu)}_{\iota_1\kappa_1}\Big) \Big(\sum_{\nu=1}^{n} P^{(\nu)}_{\iota_2\kappa_2}\Big) \cdots \Big(\sum_{\nu=1}^{n} P^{(\nu)}_{\iota_m\kappa_m}\Big). \tag{21} $$


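Among the $n^m$ terms which we obtain by developing (21) are all terms appearing
in the sum (20), each of them repeated $m!$ times and, in addition,

$$ n^m - \binom{n}{m}\, m! = n^m - n(n - 1)(n - 2) \cdots (n - m + 1) \tag{22} $$

other products of $m$ factors $P$. Since the difference (22) divided by $n^m$ goes to
zero with increasing $n$ and each $|P|$ is smaller than 1, the additional terms
have no importance. We therefore introduce the quantities

$$ P_{\iota\kappa} = \frac{1}{n} \sum_{\nu=1}^{n} P^{(\nu)}_{\iota\kappa} = \delta_{\iota\kappa}\, \frac{1}{n} \sum_{\nu=1}^{n} p_{\nu\iota} - \frac{1}{n} \sum_{\nu=1}^{n} p_{\nu\iota}\, p_{\nu\kappa}. \tag{23} $$

Then (20) can be written as

$$ n^m E_n\{\xi_1^\alpha \xi_2^\beta \xi_3^\gamma \cdots\} \sim \frac{\alpha!\,\beta!\,\gamma! \cdots}{m!} \sum 2^{-r}\, P_{\iota_1\kappa_1} P_{\iota_2\kappa_2} \cdots P_{\iota_m\kappa_m}. \tag{24} $$

Here we have a sum of a finite number of terms. It will be supposed in all that
follows that the $P_{\iota\kappa}$ as defined in (23) do not vanish identically as $n$ increases
indefinitely.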

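Since in the sum (24) no upper indices appear, equal terms repeat themselves.
We can, therefore, rearrange it, using the polynomial coefficients and absorbing
at the same time the factor $2^{-r}$. The final form of (24) is given in the following
Lemma B$_1$, which also includes a statement for the case of an uneven sum of
exponents $\alpha + \beta + \gamma + \cdots$. In fact, it is easily seen that if again half the
sum is called $m$, no group of terms on the right-hand side of (10) exists that
would supply a finite limit when divided by $n^m$. Thus we arrive at

Lemma B$_1$. If $n\xi_\kappa$ is the numerical excess of observed over expected quantities
falling in the interval $I_\kappa$, the asymptotic expectation of the excess-power product
$\xi_1^\alpha \xi_2^\beta \xi_3^\gamma \cdots$ is given by

$$ n^{\frac{1}{2}(\alpha+\beta+\gamma+\cdots)} E_n\{\xi_1^\alpha \xi_2^\beta \xi_3^\gamma \cdots\} \sim \begin{cases} 0 & \text{if } \alpha + \beta + \gamma + \cdots \text{ uneven} \\[1ex] \displaystyle\sum_{\sigma} \frac{\alpha!\,\beta!\,\gamma! \cdots}{\sigma_{11}!\,\sigma_{12}! \cdots} \big(\tfrac{1}{2} P_{11}\big)^{\sigma_{11}} \big(\tfrac{1}{2} P_{22}\big)^{\sigma_{22}} \cdots P_{12}^{\sigma_{12}} P_{13}^{\sigma_{13}} \cdots & \text{if } \alpha + \beta + \gamma + \cdots \text{ even} \end{cases} \tag{25} $$

the sum to be extended over all sets of non-negative integers $\sigma_{11}, \sigma_{12}, \cdots$
that fulfill the conditions

$$ \sigma_{11} = \tfrac{1}{2}(\alpha - \sigma_{12} - \sigma_{13} - \cdots), \qquad \sigma_{22} = \tfrac{1}{2}(\beta - \sigma_{12} - \sigma_{23} - \cdots), \quad \cdots \tag{25'} $$

The $P_{\iota\kappa}$ as defined in (23) depend on two groups of mean values only, namely on

$$ \bar p_{n\kappa} = \frac{1}{n} \sum_{\nu=1}^{n} p_{\nu\kappa} \qquad \text{and} \qquad \frac{1}{n} \sum_{\nu=1}^{n} p_{\nu\iota}\, p_{\nu\kappa}. \tag{25''} $$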













Some properties of the matrix $P_{\iota\kappa}$ will be discussed in the next Section.
For practical computation, instead of (25), a recursion formula may be used
which follows immediately from (24). Writing simply $(\alpha, \beta, \gamma, \cdots)$ for the sum
in (24) the formula reads

$$ \begin{aligned} (\alpha, \beta, \gamma, \cdots) = {} & (\alpha - 2, \beta, \gamma, \cdots) P_{11} + (\alpha, \beta - 2, \gamma, \cdots) P_{22} + \cdots \\ & + (\alpha - 1, \beta - 1, \gamma, \cdots) P_{12} + (\alpha, \beta - 1, \gamma - 1, \cdots) P_{23} + \cdots \end{aligned} \tag{26} $$

If all the original distributions $V_\nu(x)$ are equal, this recursion formula, and from
it (25), can be derived almost immediately from the theorem on the multiplication
of characteristic functions with the addition of chance variables.

Note that the expectation of the product $\xi_\iota \xi_\kappa$ is $P_{\iota\kappa}/n$ for any value of $n$.
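The closing remark, that the expectation of $\xi_\iota \xi_\kappa$ equals $P_{\iota\kappa}/n$ for every $n$ and not only asymptotically, can be verified exactly for a small case by enumerating all outcomes of $n$ independent draws. The sketch below assumes hypothetical probabilities for $n = 3$ unequal collectives with $k = 2$ intervals plus the complement; the function names are illustrative only.

```python
from fractions import Fraction
from itertools import product


def exact_expectation(prob_rows, i, j):
    """E{xi_i * xi_j}, obtained by enumerating every outcome of the n draws.

    prob_rows[v][c] is the probability that draw v falls in category c;
    the last category plays the role of the complement I_{k+1}.
    """
    n = len(prob_rows)
    cats = len(prob_rows[0])
    mean = [sum(row[c] for row in prob_rows) / n for c in range(cats)]
    total = Fraction(0)
    for outcome in product(range(cats), repeat=n):
        prob = Fraction(1)
        for v, c in enumerate(outcome):
            prob *= prob_rows[v][c]
        xi_i = Fraction(outcome.count(i), n) - mean[i]
        xi_j = Fraction(outcome.count(j), n) - mean[j]
        total += prob * xi_i * xi_j
    return total


def P_entry(prob_rows, i, j):
    """P_ij as in (23): the mean over v of delta_ij * p_vi - p_vi * p_vj."""
    n = len(prob_rows)
    delta = 1 if i == j else 0
    return sum(delta * row[i] - row[i] * row[j] for row in prob_rows) / n


# Hypothetical probabilities: three different distributions, each row sums to 1
rows = [
    [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)],
    [Fraction(1, 5), Fraction(3, 5), Fraction(1, 5)],
    [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)],
]
n = len(rows)
```

Since the enumeration grows like $(k+1)^n$, this is only a check for very small $n$, not a practical computing method.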






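4. Asymptotic expectation and variance of quantics. We first state a
characteristic property of the expression (25) for the expectation of an excess power
product. Let us denote by $C_{\alpha,\beta,\gamma,\cdots}$ the right-hand side of (25) in the case
of even $\alpha + \beta + \gamma + \cdots$. Then, if $C_{\alpha,\beta,\gamma,\cdots}$ is expressed in terms of $P_{\iota\kappa}$ and each
time the subscript 2 is changed into 1, we arrive at the value of $C_{\alpha+\beta,0,\gamma,\cdots}$.
This would not be the case if $C_{\alpha,\beta,\gamma,\cdots}$ were expressed in terms of $p_{\nu\kappa}$, since e.g.

$$ C_{2,0} = P_{11} = \frac{1}{n} \sum_{\nu} p_{\nu 1} - \frac{1}{n} \sum_{\nu} p_{\nu 1}^2, \qquad C_{1,1} = P_{12} = -\frac{1}{n} \sum_{\nu} p_{\nu 1}\, p_{\nu 2}. $$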





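In order to prove the statement we observe that the $C_{\alpha,\beta,\gamma,\cdots}$ can be derived
from the coefficients in the development of the $m$th power of a quadric:

$$ \Big(\tfrac{1}{2} \sum_{\iota,\kappa} P_{\iota\kappa}\, t_\iota t_\kappa\Big)^{m} = m! \sum \frac{C_{\alpha,\beta,\gamma,\cdots}}{\alpha!\,\beta!\,\gamma! \cdots}\; t_1^\alpha t_2^\beta t_3^\gamma \cdots \tag{27} $$

It follows that

$$ C_{\alpha,\beta,\gamma,\cdots} = \frac{1}{m!}\, \frac{\partial^{2m}}{\partial t_1^\alpha\, \partial t_2^\beta\, \partial t_3^\gamma \cdots} \Big[\Big(\tfrac{1}{2} \sum_{\iota,\kappa} P_{\iota\kappa}\, t_\iota t_\kappa\Big)^{m}\Big]. \tag{27'} $$

If in the subscripts of $P_{\iota\kappa}$ the ones and twos are identified, the quadric becomes
a function of $t_1 + t_2, t_3, t_4, \cdots$ and the derivative with respect to $\partial t_1^\alpha\, \partial t_2^\beta \cdots$ equals
the derivative with respect to $\partial t_1^{\alpha+\beta} \cdots$. On the other hand, the latter derivative
corresponds to the value of $C_{\alpha+\beta,0,\gamma,\cdots}$ in the form (27').

Taking $m = 2$, $\alpha = \beta = \gamma = \delta = 1$, eq. (25) supplies

$$ n^2 E_n\{\xi_\iota \xi_\kappa \xi_\lambda \xi_\mu\} \sim P_{\iota\kappa} P_{\lambda\mu} + P_{\iota\lambda} P_{\kappa\mu} + P_{\iota\mu} P_{\kappa\lambda}. \tag{28} $$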


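According to the above statement this is correct whether $\iota, \kappa, \lambda, \mu$ are or are not
different from each other. Thus, if $\psi_{\iota\kappa\lambda\mu}$ is a symmetric set of constants, we
have

$$ n^2 E_n\Big\{\sum \psi_{\iota\kappa\lambda\mu}\, \xi_\iota \xi_\kappa \xi_\lambda \xi_\mu\Big\} \sim 3 \sum \psi_{\iota\kappa\lambda\mu}\, P_{\iota\kappa} P_{\lambda\mu}. \tag{28'} $$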
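The three products in (28), and the factor 3 in (28'), correspond to the three ways of splitting four subscripts into couples. The following sketch enumerates such splittings and sums the corresponding products of $P_{\iota\kappa}$; it is an illustration of the pairing rule, with hypothetical function names and with a matrix $P$ built from the assumed mean probabilities $1/2$ and $3/10$ (equal distributions). For repeated subscripts the result agrees with (25), e.g. $3P_{11}^2$ for $\alpha = 4$.

```python
from fractions import Fraction


def pairings(items):
    """Yield every way of splitting an even-length list into unordered couples."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def asymptotic_moment(indices, P):
    """Sum over all pairings of the product of P[i][j], as on the right of (28)."""
    total = 0
    for pairing in pairings(list(indices)):
        term = 1
        for i, j in pairing:
            term *= P[i][j]
        total += term
    return total


# Hypothetical P matrix for equal distributions with p_1 = 1/2, p_2 = 3/10
p = [Fraction(1, 2), Fraction(3, 10)]
P = [[(p[i] if i == j else 0) - p[i] * p[j] for j in range(2)] for i in range(2)]

fourth_moment = asymptotic_moment([0, 0, 0, 0], P)   # alpha = 4
mixed_moment_22 = asymptotic_moment([0, 0, 1, 1], P)  # alpha = beta = 2
```

The number of pairings of $2m$ subscripts is $1 \cdot 3 \cdot 5 \cdots (2m - 1)$, which is 3 for $m = 2$ and 15 for $m = 3$.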


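In general, the numerical factor to the right, i.e. the number of sets of couples
drawn from $2m$ figures, is $(2m)!/2^m m! = 1 \cdot 3 \cdots (2m - 1)$. Thus we can
state:

Lemma B$_2$. If a quantic $f_{2m}$ is defined according to (8) with symmetric
coefficients, its asymptotic expectation is given by

$$ n^m E_n\{f_{2m}\} \sim 1 \cdot 3 \cdot 5 \cdots (2m - 1) \sum \psi_{\iota_1 \iota_2 \cdots \iota_{2m}}\, P_{\iota_1\iota_2} P_{\iota_3\iota_4} \cdots P_{\iota_{2m-1}\iota_{2m}}. \tag{29} $$

Before applying this to the continuous case defined in (6), let us consider some
characteristic properties of the matrix $P_{\iota\kappa}$. According to the definition (19)
of $P^{(\nu)}_{\iota\kappa}$ we have

$$ \sum_{\iota,\kappa}^{1 \cdots k} P^{(\nu)}_{\iota\kappa}\, t_\iota t_\kappa = \sum_{\iota=1}^{k} p_{\nu\iota}\, t_\iota^2 - \Big(\sum_{\iota=1}^{k} p_{\nu\iota}\, t_\iota\Big)^{2} \tag{30} $$

and using (9) one easily derives from Schwarz' inequality

$$ \sum_{\iota} p_{\nu\iota}\, t_\iota^2 - \Big(\sum_{\iota} p_{\nu\iota}\, t_\iota\Big)^{2} \ge p_{\nu,k+1} \sum_{\iota} p_{\nu\iota}\, t_\iota^2. $$

Since $P_{\iota\kappa}$ is the arithmetical mean of the $P^{(\nu)}_{\iota\kappa}$ it follows that the matrix $P_{\iota\kappa}$
is at least semi-definite and is positive definite except when all $p_{\nu,k+1} = 0$.
In the latter case (if e.g. the $k$ intervals cover the whole $x$-axis) one has

$$ \sum_{\iota,\kappa}^{1 \cdots k} P_{\iota\kappa} = \frac{1}{n} \sum_{\nu} \Big[\sum_{\iota=1}^{k} p_{\nu\iota} - \Big(\sum_{\iota=1}^{k} p_{\nu\iota}\Big)^{2}\Big] = \frac{1}{n} \sum_{\nu} p_{\nu,k+1}\,(1 - p_{\nu,k+1}) = 0 \tag{31} $$

which shows that here the reciprocal matrix $P^{-1}_{\iota\kappa}$ does not exist.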

In the “complete” case, that is, with all $p_{\nu,k+1} = 0$, the elements in each
horizontal or vertical line of the matrix $P_{\iota\kappa}$ have the sum zero. It follows that
the $k$ homogeneous equations $\sum_\kappa P_{\iota\kappa} x_\kappa = 0$ have the solution $x_1 = x_2 = \cdots = x_k$
and, therefore, that the cofactors of all elements of $P_{\iota\kappa}$ have one and the same
value. For each single $\nu$ the determinant of $P^{(\nu)}_{\iota\kappa}$ can be computed:

$$ |P^{(\nu)}_{\iota\kappa}| = p_{\nu 1}\, p_{\nu 2} \cdots p_{\nu k}\, p_{\nu,k+1}. $$

If this is applied to the principal minors of the same determinant in the case
$p_{\nu,k+1} = 0$, one finds the characteristic equation of the matrix $P^{(\nu)}_{\iota\kappa}$ to be

$$ |\delta_{\iota\kappa} - \lambda P^{(\nu)}_{\iota\kappa}| = -\frac{d}{d\lambda}\, \big[(1 - \lambda p_{\nu 1})(1 - \lambda p_{\nu 2}) \cdots (1 - \lambda p_{\nu k})\big]. $$

This shows that $(k - 1)$ characteristic roots separate the abscissas $1/p_{\nu 1},
1/p_{\nu 2}, \cdots, 1/p_{\nu k}$ (one root being zero).
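The determinant formula and the vanishing line sums in the complete case can be confirmed on a small example with exact arithmetic. This is a sketch under stated assumptions: the probabilities are hypothetical, the function names are illustrative, and the determinant is computed by Laplace expansion, which is adequate only for small $k$.

```python
from fractions import Fraction


def excess_matrix(p):
    """Matrix with entries delta_ij * p_i - p_i * p_j, as in (19), for one collective."""
    k = len(p)
    return [[(p[i] if i == j else 0) - p[i] * p[j] for j in range(k)] for i in range(k)]


def determinant(matrix):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for col, entry in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * determinant(minor)
    return total


# Incomplete case: k = 3 intervals, complement probability 2/5 (hypothetical values)
p_incomplete = [Fraction(1, 5), Fraction(3, 10), Fraction(1, 10)]
det_incomplete = determinant(excess_matrix(p_incomplete))

# Complete case: the three intervals carry the whole probability
p_complete = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
M_complete = excess_matrix(p_complete)
row_sums = [sum(row) for row in M_complete]
det_complete = determinant(M_complete)
```

In the complete case the determinant vanishes, in agreement with the statement that the reciprocal matrix does not exist.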

The number $k$ of intervals has nothing to do with the preceding argument
leading to the eqs. (25) to (28). The entire computation can also be repeated
in terms of $dT_n(x_1), dT_n(x_2), dT_n(x_3), \cdots$ instead of $\xi_1, \xi_2, \xi_3, \cdots$ if
appropriate differentials are substituted for the $P_{\iota\kappa}$. To find the latter ones we note
that $p_{\nu\kappa}$ stands for the increment $dV_\nu(x)$. Thus, using $\delta(x, y)$ in analogy to
$\delta_{\iota\kappa}$ ($= 1$ for $x = y$ and $= 0$ for $x \ne y$) we set

$$ dU_\nu(x, y) = \delta(x, y)\, dV_\nu(x) - dV_\nu(x)\, dV_\nu(y) = \delta(x, y)\, dV_\nu(x) - dW_\nu(x, y) \tag{32} $$

which is equivalent to the definition of a function of 2 variables:

$$ U_\nu(x, y) = \begin{cases} V_\nu(x) - V_\nu(x) V_\nu(y) = V_\nu(x) - W_\nu(x, y) & (x \le y) \\ V_\nu(y) - V_\nu(x) V_\nu(y) = V_\nu(y) - W_\nu(x, y) & (x \ge y). \end{cases} \tag{33} $$

Then $P_{\iota\kappa}$ has to be replaced by

$$ d\bar U_n(x, y) = \frac{1}{n} \sum_{\nu=1}^{n} dU_\nu(x, y) = \delta(x, y)\, d\bar V_n(x) - d\bar W_n(x, y). \tag{34} $$

This $d\bar U_n(x, y)$ is $n$ times the expectation of $dT_n(x)\, dT_n(y)$, in agreement with
the note at the end of section 3.
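The relation between the function $U_\nu(x, y)$ of (33) and the discrete quantities $P^{(\nu)}_{\iota\kappa}$ of (19) can be illustrated by taking increments of $U_\nu$ over product intervals. The sketch below assumes, for illustration only, the uniform distribution on $(0, 1)$ for $V_\nu$; the function names are hypothetical.

```python
from fractions import Fraction


def distribution_excess(V):
    """Return U(x, y) = V(min(x, y)) - V(x) * V(y), following (33)."""
    def U(x, y):
        return V(min(x, y)) - V(x) * V(y)
    return U


def rectangle_increment(U, x_lo, x_hi, y_lo, y_hi):
    """Increment of U over the product interval (x_lo, x_hi] x (y_lo, y_hi]."""
    return U(x_hi, y_hi) - U(x_lo, y_hi) - U(x_hi, y_lo) + U(x_lo, y_lo)


def V_uniform(x):
    """Assumed example: uniform distribution function on (0, 1)."""
    return min(max(x, 0), 1)


U = distribution_excess(V_uniform)
a, b, c = Fraction(0), Fraction(1, 4), Fraction(1, 2)

# Two different intervals I_1 = (0, 1/4], I_2 = (1/4, 1/2], each of probability 1/4
off_diagonal = rectangle_increment(U, a, b, b, c)
# The same interval taken twice
diagonal = rectangle_increment(U, a, b, a, b)
```

For two different intervals the increment is $-p_{\nu\iota} p_{\nu\kappa}$ and for the same interval taken twice it is $p_{\nu\iota}(1 - p_{\nu\iota})$, matching (18) and (19).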
The function

$$ \bar U_n(x, y) = \frac{1}{n} \sum_{\nu=1}^{n} U_\nu(x, y) \tag{35} $$

is the difference of two cumulative distribution functions, one corresponding to
a distribution along the straight line $x = y$ with the element $d\bar V_n(x)$ and
another distribution over the whole plane with the element

$$ d\bar W_n(x, y) = \frac{1}{n} \sum_{\nu=1}^{n} dV_\nu(x)\, dV_\nu(y). \tag{35'} $$





To each one-dimensional distribution $V_\nu(x)$ belongs one “distribution excess”
$U_\nu(x, y)$ as defined in (33). The $P^{(\nu)}_{\iota\kappa}$ are the increments of $U_\nu(x, y)$ within
the product interval $dx\,dy$. It is seen from the preceding argument that the
asymptotic moments of any quantic (6) or (8) depend only on the average $\bar U_n$
of the distribution excesses $U_\nu$.

If a quantic is defined by (6) and the integrals on both sides exist, the
asymptotic expectation of $f_{2m}$ may be written in formal analogy to (29) as

$$ n^m E_n\{f_{2m}\} \sim 1 \cdot 3 \cdot 5 \cdots (2m - 1) \int\!\!\int \cdots \int \psi(x_1, x_2, \cdots, x_{2m})\, d\bar U_n(x_1, x_2)\, d\bar U_n(x_3, x_4) \cdots d\bar U_n(x_{2m-1}, x_{2m}). \tag{36} $$

This formula is identical with (29) if $\psi$ has constant values in a finite number
of intervals and vanishes outside these intervals. But it will be seen in the next
section that (36) can be used in more general cases also.

For the sake of practical computation one may develop the right-hand side


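of (36) into terms explicitly depending on the given averages $\bar V_n(x)$ and $\bar W_n(x, y)$.
For example, in the case $m = 3$:

$$ \begin{aligned} n^3 E_n\{f_6\} \sim 1 \cdot 3 \cdot 5 \int \cdots \int \big[\, & \psi(x_1, x_1, x_2, x_2, x_3, x_3)\, d\bar V_n(x_1)\, d\bar V_n(x_2)\, d\bar V_n(x_3) \\ & - 3\,\psi(x_1, x_1, x_2, x_2, x_3, x_4)\, d\bar V_n(x_1)\, d\bar V_n(x_2)\, d\bar W_n(x_3, x_4) \\ & + 3\,\psi(x_1, x_1, x_2, x_3, x_4, x_5)\, d\bar V_n(x_1)\, d\bar W_n(x_2, x_3)\, d\bar W_n(x_4, x_5) \\ & - \psi(x_1, x_2, x_3, x_4, x_5, x_6)\, d\bar W_n(x_1, x_2)\, d\bar W_n(x_3, x_4)\, d\bar W_n(x_5, x_6) \big]. \end{aligned} \tag{37} $$

In the general case, the numerical factors in the $m$-tuple integral are the binomial
coefficients of order $m$.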

The higher moments of quantics $f_m$ can be computed in the same way as
$E_n\{f_m\}$ since any power of $f_m$ is a quantic again. The formulas, however,
become more involved since the coefficients of the powers of $f_m$ are not immediately
given in a symmetric form. It will suffice to show here how the (second order) variance
of $f_2$ can be found. The second moment is the expectation of

$$ f_2^2 = \iiiint \psi(x, y)\,\psi(z, u)\, dT_n(x)\, dT_n(y)\, dT_n(z)\, dT_n(u). \tag{39} $$

Applying here eq. (28) we have

$$ n^2 E_n\{f_2^2\} \sim \iiiint \psi(x, y)\,\psi(z, u)\, \big[\, d\bar U_n(x, y)\, d\bar U_n(z, u) + d\bar U_n(x, z)\, d\bar U_n(y, u) + d\bar U_n(x, u)\, d\bar U_n(y, z) \,\big]. \tag{40} $$

The first term in the brackets leads to the square of $n E_n\{f_2\}$ while the second
and third terms, due to the symmetry of $\psi(x, y)$, supply two equal integrals.
Thus

$$ \begin{aligned} \text{Var}\,\{n f_2\} & \sim 2 \iiiint \psi(x, y)\,\psi(z, u)\, d\bar U_n(x, z)\, d\bar U_n(y, u) \\ & = 2 \Big[ \iint \psi^2(x, y)\, d\bar V_n(x)\, d\bar V_n(y) - 2 \iiint \psi(x, y)\,\psi(y, z)\, d\bar V_n(y)\, d\bar W_n(x, z) \\ & \qquad + \iiiint \psi(x, y)\,\psi(z, u)\, d\bar W_n(x, z)\, d\bar W_n(y, u) \Big]. \end{aligned} \tag{41} $$

In the same way moments and variances of any order can be computed for any
quantic $f_m$.


5. Final statement on the limit of expectation of quantics. We shall prove
the following:

Lemma B$_3$. Given a sequence of distributions $V_1(x), V_2(x), V_3(x), \cdots$ and a
quantic of order $2m$

$$ f_{2m} = \int \cdots \int \psi(x_1, x_2, \cdots, x_{2m})\, dT_n(x_1)\, dT_n(x_2) \cdots dT_n(x_{2m}) $$

assume that there exist a continuous function $\Psi(x)$ and a distribution $V(x)$ such that

$$ \begin{aligned} & |\psi(x_1, x_2, \cdots, x_{2m})| < \Psi(x_1)\, \Psi(x_2) \cdots \Psi(x_{2m}) \\ & dV_\nu(x) \le dV(x) \quad \text{for } \nu = 1, 2, 3, \cdots, \quad |x| > X, \end{aligned} \tag{42} $$

and that the integrals

$$ \int \Psi^{\iota}(x)\, dV(x), \qquad (\iota = 1, 2, \cdots, 2m), \tag{42'} $$

have finite values. Then, for any $\delta > 0$

$$ \lim_{n \to \infty} n^{m-\delta} E_n\{f_{2m}\} = 0. \tag{43} $$

This lemma, on which the main theorem of Part II is based, will be
established if it is shown that the formula (36) holds true for functions $\psi$ satisfying
the conditions (42).

In the transition from the complete expression (10) for the expectation $E_n$
to the asymptotic value (25) two essential steps were made. First, certain
products of the form (11) have been omitted and, second, certain products
of $P^{(\nu)}_{\iota\kappa}$ as defined in (19) have been arbitrarily added. This was allowed
because each of the products was seen to be smaller than 1 and their number was
of the order $O(n^{m-1})$. If a quantic in integral form (6) is considered which
involves an infinite number of expressions like (10), a sharper estimate is
necessary.

It is easily seen that each integral (11') is a polynomial in $p_{\nu\kappa}$ including the
product $p_{\nu\iota}\, p_{\nu\kappa} \cdots$ and another factor which is certainly bounded whatever
the $p_{\nu\kappa}$ are. Thus, if the expectation of $\xi_1 \xi_2 \cdots \xi_{2m}$ is computed, each term of the
form (11') consists of a finite factor and the product $p_{\nu 1}\, p_{\nu 2} \cdots p_{\nu,2m}$. In passing
to the expectation of the quantic, the $p_{\nu\kappa}$ have to be replaced by $dV_\nu(x_\kappa)$ and
each neglected term in (10) leads to an expression like

$$ \int\!\!\int \cdots \int \psi(x_1, x_2, \cdots, x_{2m})\, dV_{\nu_1}(x_1)\, dV_{\nu_2}(x_2) \cdots dV_{\nu_{2m}}(x_{2m}). \tag{45} $$

According to the assumptions of B$_3$ this integral has a finite value. The number
of neglected terms being of the order $O(n^{m-1})$ the omission of these terms is
justified.

On the other hand, products of $P^{(\nu)}_{\iota\kappa}$ equal, except for the sign, products
of $p_{\nu\iota}\, p_{\nu\kappa}$ as long as $\iota \ne \kappa$ and, except for a finite factor, products of $p_{\nu\iota}$ as often
as $\iota = \kappa$. Again it is seen that the arbitrarily added terms sum up to integrals
of the form (45). This shows that here too, if the conditions of B$_3$ are fulfilled,
the procedure leading to (25) may be applied.

It follows that, under the conditions (42), if the integral (42') has a finite
value, eq. (36) is correct and (43) is an immediate consequence of it. On the
other hand, it is obvious that weaker conditions than those given in B$_3$ would
suffice to establish (43).


6. Theorem on products of n functions. The principal source of all explicit 
formulas on asymptotic distributions lies in certain properties of products of a 
great number of factors. Laplace devoted a part of his fundamental Treatise 
of Probability to these problems, but a complete outline of all results from a 
modern point of view is still lacking. In the third part of the present paper, a 
rather simple statement on this line will be used which may be formulated here as 

Lemma C. Let $F_\nu(z_1, z_2, \cdots, z_k)$, $(\nu = 1, 2, 3, \cdots)$, be a sequence of analytic
functions of k complex variables and $G_n$ the product $F_1F_2 \cdots F_n$. Suppose that
at the point $z_1 = z_2 = \cdots = z_k = 0$ all $F_\nu$ have the value 1, vanishing first derivatives,
and the second derivatives


(46) $A^{(\nu)}_{\iota\kappa} = \dfrac{\partial^2 F_\nu}{\partial z_\iota\,\partial z_\kappa}.$


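Then


(47) $\lim_{n\to\infty}\left[\log G_n\!\left(\frac{z_1}{\sqrt n}, \frac{z_2}{\sqrt n}, \cdots, \frac{z_k}{\sqrt n}\right) - \frac{1}{2n}\sum_{\nu=1}^{n}\sum_{\iota,\kappa} A^{(\nu)}_{\iota\kappa} z_\iota z_\kappa\right] = 0$


uniformly in each bounded region $|z_\kappa| \le Z$ in which the absolute values of the third
derivatives of all $F_\nu$ have an upper bound M.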
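In fact, the Taylor development of $F_\nu$ supplies under the conditions stated:


(48) $F_\nu(z_1, z_2, \cdots, z_k) = 1 + \frac12\sum_{\iota,\kappa} A^{(\nu)}_{\iota\kappa} z_\iota z_\kappa + O(Z^3)$


and, therefore,


(48′) $\log F_\nu(z_1, z_2, \cdots, z_k) = \frac12\sum_{\iota,\kappa} A^{(\nu)}_{\iota\kappa} z_\iota z_\kappa + O(Z^3).$


If here all $z_\kappa$ are replaced by $z_\kappa/\sqrt n$ and the equations added for $\nu = 1, 2, \cdots, n$
we obtain


(49) $\log G_n\!\left(\frac{z_1}{\sqrt n}, \frac{z_2}{\sqrt n}, \cdots, \frac{z_k}{\sqrt n}\right) = \frac{1}{2n}\sum_{\nu=1}^{n}\sum_{\iota,\kappa} A^{(\nu)}_{\iota\kappa} z_\iota z_\kappa + n\,O\!\left(\frac{Z^3}{n\sqrt n}\right)$


and this shows that the brackets on the left-hand side of (47) are $O(Z^3/\sqrt n)$.
It is obvious that (47) would still hold if the condition concerning the third
derivatives is replaced by a somewhat weaker one.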
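
The statement of Lemma C can be tried out numerically in the simplest case k = 1. The following sketch is only an illustration under assumptions that are not in the text: the factors are taken as $F_\nu(z) = \cosh(a_\nu z)$ with hypothetical coefficients $a_\nu$ between 1 and 2, so that $F_\nu(0) = 1$, $F_\nu'(0) = 0$ and the second derivative at zero is $a_\nu^2$. The function `bracket` evaluates the difference between $\log G_n(z/\sqrt n)$ and the quadratic term, which should shrink as n grows.

```python
import math

def log_product(a, z, n):
    """log G_n(z / sqrt(n)) for the assumed factors F_nu(z) = cosh(a_nu * z)."""
    return sum(math.log(math.cosh(a[v] * z / math.sqrt(n))) for v in range(n))

def quadratic_term(a, z, n):
    """(1 / 2n) * sum over nu of A^(nu) * z^2, with A^(nu) = F_nu''(0) = a_nu^2."""
    return sum(a[v] ** 2 for v in range(n)) * z * z / (2.0 * n)

def bracket(z, n):
    """Difference between log G_n(z / sqrt(n)) and its quadratic approximation."""
    # Hypothetical bounded coefficients a_nu in [1, 2]; any bounded choice would do.
    a = [1.0 + (v % 5) / 4.0 for v in range(n)]
    return log_product(a, z, n) - quadratic_term(a, z, n)

for n in (10, 100, 1000, 10000):
    print(n, bracket(1.5, n))
```

For this particular family the third derivatives vanish at the origin, so the difference decreases faster than the general bound $O(Z^3/\sqrt n)$ requires.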






















PART II. DIFFERENTIABLE STATISTICAL FUNCTIONS 


1. Definitions. We consider a one-dimensional cumulative distribution function
V(x) as a point in the V-space. If two points $V_1(x)$ and $V_2(x)$ are given
the functions


(1) $V_1(x) + t[V_2(x) - V_1(x)], \qquad 0 \le t \le 1$


represent the straight segment between $V_1(x)$ and $V_2(x)$. A subset of the V-space
that includes all segments determined by its elements is called a convex domain.

Now, assume that a sequence of collectives with the distributions $V_1(x)$,
$V_2(x)$, $V_3(x), \cdots$ be given. We shall consider functions f{V(x)} defined in a
convex domain that includes particularly: (1) all average distributions $\bar V_n(x)$


(2) $\bar V_n(x) = \frac1n\sum_{\nu=1}^{n} V_\nu(x)$


at least from a certain n on; (2) all repartitions $S_n(x)$ that can occur, i.e. the
repartitions of n quantities that belong to the label sets of the given collectives
(e.g. positive x, etc.). If $V^0(x)$ and V(x) are any two points of the domain, the
quantity


(3) $F(t) = f\{V^0(x) + t[V(x) - V^0(x)]\}, \qquad 0 \le t \le 1$


is a function of the real variable t. It will be supposed to admit derivatives
with respect to t up to the order r + 1.

Following Volterra [9, 10] we define (in a slightly modified way) the derivative
f′ of a statistical function f in analogy to the set of partial derivatives of a function
of several variables. If V(x) would stand for a set of distinct variables
$V_1, V_2, V_3, \cdots$ and $V^0(x)$ for their initial values $V_1^0, V_2^0, V_3^0, \cdots$ one would
have


$\frac{d}{dt} f\{V^0(x) + t[V(x) - V^0(x)]\}\Big|_{t=0} = \sum_{\nu} \frac{\partial f}{\partial V_\nu}\,(V_\nu - V_\nu^0)$


where $\partial f/\partial V_\nu$ is the partial derivative of f with respect to $V_\nu$ taken at the point
$V_\nu = V_\nu^0$. Thus we write


(4) $\frac{d}{dt} f\{V^0(x) + t[V(x) - V^0(x)]\}\Big|_{t=0} = \int f'\{V^0(x), y\}\, d(V - V^0)(y)$


and call f′ which depends on $V^0(x)$ and on a scalar variable y, but not on V(x),
the (first) derivative of f{V(x)} at the point $V^0(x)$. Only if a relation (4) is
fulfilled for any two points of the convex domain, f is called a (one time) differentiable
function.
The derivative of a linear function


(5) $A = \int \alpha(x)\,dV(x), \qquad B = \int \beta(x)\,dV(x), \cdots$


is simply the factor $\alpha(y)$, $\beta(y), \cdots$ respectively, independent of the point at
which the derivative is taken. If f is given as a function of A, B, $\cdots$ one has


(6) $f'\{V(x), y\} = \frac{\partial f}{\partial A}\,\alpha(y) + \frac{\partial f}{\partial B}\,\beta(y) + \cdots.$


The derivative of the non-linear function

(7) $f = \iint \psi(x, y)\,dV(x)\,dV(y)$

is

(8) $f'\{V^0(x), y\} = \int [\psi(x, y) + \psi(y, x)]\,dV^0(x).$


Note that an additive constant in f′ (i.e. a quantity independent of y) has no
significance since the integral of $d(V - V^0)$ vanishes. It follows from (6)
that the first derivative of the mth order variance as defined in (2) of the Introduction,
at the point $V^0(x)$ is


(9) $(y - a)^m - my\int (x - a)^{m-1}\,dV^0(x)$


where a is the mean value of $V^0(x)$.
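
Formula (9) can be checked on distributions with finitely many label values, where every integral becomes a finite sum. The sketch below is an illustration only; the support points `xs` and the two probability vectors `p0` and `p` are hypothetical. It compares the integral of the kernel (9) against $d(V - V^0)$ with a central difference quotient of F(t) at t = 0.

```python
def variance_m(xs, p, m):
    """m-th order variance: sum of (x - a)^m * p, a being the mean of the distribution."""
    a = sum(x * w for x, w in zip(xs, p))
    return sum((x - a) ** m * w for x, w in zip(xs, p))

def derivative_kernel(xs, p0, m):
    """Kernel (9) at the point p0: (y - a)^m - m*y*sum((x - a)^(m-1) * p0)."""
    a = sum(x * w for x, w in zip(xs, p0))
    c = sum((x - a) ** (m - 1) * w for x, w in zip(xs, p0))
    return [(y - a) ** m - m * y * c for y in xs]

def directional_derivative(xs, p0, p, m, h=1e-6):
    """Central difference quotient of F(t) = f{V0 + t(V - V0)} at t = 0."""
    def mix(t):
        return [w0 + t * (w - w0) for w0, w in zip(p0, p)]
    return (variance_m(xs, mix(h), m) - variance_m(xs, mix(-h), m)) / (2 * h)

# Hypothetical example: four label values, two distributions on them.
xs = [0.0, 1.0, 2.0, 4.0]
p0 = [0.1, 0.4, 0.3, 0.2]
p = [0.25, 0.25, 0.25, 0.25]
m = 3

kernel = derivative_kernel(xs, p0, m)
via_kernel = sum(k * (w - w0) for k, w, w0 in zip(kernel, p, p0))
via_limit = directional_derivative(xs, p0, p, m)
print(via_kernel, via_limit)
```

The two printed numbers should agree up to the error of the difference quotient, which is what relation (4) asserts for this function.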

In the same way derivatives of higher order can be introduced. The second
derivative of f{V(x)} is a function of $V^0(x)$, i.e. of the point at which the derivative
is taken, and of two scalar variables y, z which correspond to the two subscripts
in the case of a function of distinct variables. The definition of
f″{V(x), y, z} is given in the equation


(10) $\frac{d^2}{dt^2} f\{V^0(x) + t[V(x) - V^0(x)]\}\Big|_{t=0} = \iint f''\{V^0(x), y, z\}\, d(V - V^0)(y)\, d(V - V^0)(z).$


The second derivative of a linear function is zero. The function (7) has the
second derivative $\psi(z, y) + \psi(y, z)$ independently of $V^0(x)$. The mth order
variance gives, twice differentiated


(11) $-2mz(y - a)^{m-1} + m(m - 1)\,yz\int (x - a)^{m-2}\,dV^0(x).$


The variables y and z in f″ or in any additive term of f″ may be interchanged
and a term depending on one of them may be added or omitted. Thus, f″
can always be written as a symmetric function of y, z without linear terms.
Accordingly, the second derivative of (7) is also $2\psi(y, z)$.










The derivative of rth order of f at the point $V^0(x)$ will be defined by the
equation


(12) $\frac{d^r}{dt^r} f\{V^0(x) + t[V(x) - V^0(x)]\}\Big|_{t=0} = \int\!\!\int\cdots\int f^{(r)}\{V^0(x), y_1, y_2, \cdots, y_r\}\, d(V - V^0)(y_1) \cdots d(V - V^0)(y_r).$


Here, for given $V^0(x)$, $f^{(r)}$ may be supposed to be a symmetric function of the r
variables $y_1, y_2, \cdots, y_r$. The rth derivative of the mth order variance is


(13) $\frac{(-1)^r m!}{(m - r + 1)!}\, y_1y_2 \cdots y_r \times \left[(m - r + 1)\int (x - a)^{m-r}\,dV^0(x) - \sum_{k=1}^{r} \frac{(y_k - a)^{m-r+1}}{y_k}\right].$


In the case r = m the expression becomes independent of $V^0(x)$, viz.


(13′) $(-1)^m m!\; y_1y_2 \cdots y_m(1 - m)$


where terms depending on less than r of the variables $y_1, y_2, \cdots, y_r$ have been
omitted.

If the definitions (4), (10), (12) are compared one can see that f″{V, y, z}
is the first derivative of f′{V, y} etc. For proofs see [9] and [10].











2. Taylor development. The function F(t) defined in (3) admits the development


(14) $F(1) - F(0) = F'(0) + \frac{1}{2!}F''(0) + \cdots + \frac{1}{r!}F^{(r)}(0) + \frac{1}{(r+1)!}F^{(r+1)}(\vartheta)$


where $\vartheta$ is some quantity between zero and one. According to (3) the left-hand
side equals the difference $f\{V(x)\} - f\{V^0(x)\}$. The expressions $F'(0), F''(0), \cdots, F^{(r)}(0)$
are the derivatives as defined in eqs. (4), (10), (12). In the last term
to the right, one has to introduce the distribution


(15) $V'(x) = V^0(x) + \vartheta[V(x) - V^0(x)]$


and then to take the (r + 1)st derivative of f at the point V′(x).

For a given $V^0(x)$ each one of the terms on the right-hand side of (14) is a
function of V(x). Except for the last one, in which $\vartheta$ depends in a certain way
on V(x), they are quantics with respect to $V(x) - V^0(x)$, of the same kind as
those considered in Part I. (There we had $S_n$ instead of V and $\bar V_n$ instead
of $V^0$.)





The rth term of (14) can be written as


(16) $F_r = \frac{1}{r!}\int\!\!\int\cdots\int \psi(x_1, x_2, \cdots, x_r)\, d(V - V^0)(x_1) \cdots d(V - V^0)(x_r)$


where according to (12)

(16′) $\psi(x_1, x_2, \cdots, x_r) = f^{(r)}\{V^0(x), x_1, x_2, \cdots, x_r\}.$


To find the characteristic properties of $F_r$ we compute its derivatives at a point
$V_1(x)$. To do this we must replace in (16) the V(x) by


$V_1(x) + t[V(x) - V_1(x)],$

then differentiate the product

(17) $\prod_{\rho=1}^{r} d[(V_1 - V^0)(x_\rho) + t(V - V_1)(x_\rho)]$


with respect to t, and finally set t = 0. The derivative consists of r terms
the first of which will be


$d(V - V_1)(x_1) \prod_{\rho=2}^{r} d(V_1 - V^0)(x_\rho).$

Due to the fact that $\psi$ may be supposed as a symmetric function, all r terms
supply the same integral. Thus the derivative of $F_r$ with respect to t at the point
t = 0 can be written as


$\frac{1}{(r-1)!}\int\!\!\int\cdots\int \psi(x_1, x_2, \cdots, x_r)\, d(V - V_1)(x_1) \prod_{\rho=2}^{r} d(V_1 - V^0)(x_\rho).$


Comparing this with the formula (4) which defines the first derivative of a
statistical function and writing y instead of $x_1$ and V(x) instead of $V_1(x)$, we find


(18) $F_r'\{V(x), y\} = \frac{1}{(r-1)!}\int\!\!\int\cdots\int \psi(y, x_2, \cdots, x_r)\, d(V - V^0)(x_2) \cdots d(V - V^0)(x_r).$


This is the first derivative of $F_r\{V(x)\}$ at the point V(x). It vanishes at the
point $V(x) = V^0(x)$.

The integral in (18) has the same form as that in (16) except that its multiplicity
is (r − 1) rather than r. Thus it is immediately seen how the higher
derivatives of $F_r$ can be found. For the second derivative $F_r''\{V(x), y, z\}$
we have simply to replace (r − 1)! in (18) by (r − 2)!, then $x_2$ by z and finally
to omit in the product the differential $d(V - V^0)(x_2)$. This procedure can be
continued up to the derivative of order (r − 1). The rth derivative, finally,
will be


(19) $F_r^{(r)}\{V(x), y_1, y_2, \cdots, y_r\} = \psi(y_1, y_2, \cdots, y_r)$


independent of V(x) and, according to (16′), equal to the rth derivative of
f{V(x)} at the point $V^0(x)$. It is also seen that all integrals of the form (16)
or (18) vanish if V(x) equals $V^0(x)$. The results can be summarized as follows:
The sth term, (s = 1, 2, $\cdots$, r), of the development (14) is a function of V(x)
for which all derivatives at the point $V^0(x)$ except that of order s vanish while
this one equals the sth derivative of the original function f{V(x)} at $V^0(x)$.
The complete analogy of (14) with the Taylor development of a function of
distinct variables is thus evident.

If we assume that f{V(x)} is a function whose first (r − 1) derivatives vanish
at the point $V^0(x)$, eq. (14) takes the form


(20) $f\{V(x)\} - f\{V^0(x)\} = \frac{1}{r!}\int\!\!\int\cdots\int f^{(r)}\{V^0(x), y_1, y_2, \cdots, y_r\}\, d(V - V^0)(y_1) \cdots d(V - V^0)(y_r)$

$\qquad + \frac{1}{(r+1)!}\int\!\!\int\cdots\int f^{(r+1)}\{V'(x), y_1, y_2, \cdots, y_{r+1}\}\, d(V - V^0)(y_1) \cdots d(V - V^0)(y_{r+1}).$


By applying to this formula the lemmas A and B of Part I, we shall arrive at
the general theorem on asymptotic distributions that is the principal goal of
this paper.


3. General theorem. The main result to be derived in the general theory of 
asymptotic distributions is that the so-called normal distribution represents 
the first element in an infinite sequence which includes the asymptotic dis- 
tributions of all differentiable statistical functions, except certain irregular 
cases. The Gauss distribution covers in fact only those functions whose Taylor 
development starts with the first (linear) term, in particular the linear statistical 
functions themselves. If the first (r — 1) terms in the development vanish, 
the asymptotic distribution of type r becomes valid. 

THEOREM I: Let $V_1(x), V_2(x), V_3(x), \cdots$ be an infinite sequence of distributions
and f{V(x)} a statistical function with derivatives up to order (r + 1). Denote by
$S_n(x)$ the repartition of the n label values in the collective with the distribution element
$dV_1(x)\,dV_2(x) \cdots dV_n(x)$ and by $\bar V_n(x)$ the arithmetical mean of $V_1(x)$,
$V_2(x), \cdots, V_n(x)$. If for large n the first (r − 1) derivatives of f{V(x)} at the
point $\bar V_n(x)$ vanish and the rth derivative equals $\psi_n(y_1, y_2, \cdots, y_r)$, then the
distribution of


(21) $A_n = n^{r/2}[f\{S_n(x)\} - f\{\bar V_n(x)\}]$


is asymptotically equal to the distribution of the rth order quantic


(22) $B_n = \frac{n^{r/2}}{r!}\int\!\!\int\cdots\int \psi_n(x_1, x_2, \cdots, x_r)\, d(S_n - \bar V_n)(x_1)\, d(S_n - \bar V_n)(x_2) \cdots d(S_n - \bar V_n)(x_r)$

under the following conditions:
a) The distribution of (22) has a uniformly bounded derivative for all n;
b) Within a convex domain in the V-space that includes all $\bar V_n(x)$ from a certain
n on, and all $S_n(x)$ that can occur, the (r + 1)st derivative of f{V(x)} is smaller
in absolute value than a product $\Psi(y_1)\Psi(y_2) \cdots \Psi(y_{r+1})$ whereby the
integrals $\int \Psi^k(x)\,dV_\nu(x)$ for k = 1, 2, $\cdots$, 2(r + 1) have a finite upper
bound for $\nu$ = 1, 2, 3, $\cdots$.

In order to prove this we introduce in eq. (20) $S_n(x)$ for V(x) and $\bar V_n(x)$ for
$V^0(x)$, and multiply both sides by $n^{r/2}$. Using the notations (21) and (22) and
writing $T_n$ for $(S_n - \bar V_n)$, the equation reads

(23) $A_n - B_n = \frac{n^{r/2}}{(r+1)!}\int\!\!\int\cdots\int f^{(r+1)}\{V_n'(x), y_1, y_2, \cdots, y_{r+1}\}\, dT_n(y_1) \cdots dT_n(y_{r+1}).$


According to Lemma A the theorem will be verified if we can show that the
expectation of the absolute value of the right-hand expression in (23) tends to
zero.


According to the Schwarz inequality one has, for any real C:

(24) $E_n\{|C|\} \le \sqrt{E_n\{C^2\}}.$


For fixed values of $\bar V_n$ and $S_n$ the integral on the right-hand side of (23) is a
quantic of order (r + 1) with the coefficients $\psi_{r+1}(y_1, y_2, \cdots, y_{r+1})$. The
square of this integral is a quantic of order 2(r + 1) whose coefficients are a finite
number (depending only on r) of terms each of which is a product of two $\psi_{r+1}$-values
implying 2(r + 1) variables $y_1, y_2, \cdots, y_{2(r+1)}$. The absolute value of
these coefficients is, therefore, according to the condition b) smaller than a
finite factor times the product $\Psi(y_1)\Psi(y_2) \cdots \Psi(y_{2(r+1)})$ and thus fulfills the
condition of Lemma B. If the right-hand side of (23) is identified with C, the
expectation of $C^2$ is, except for a finite factor, the product of $n^r$ times the expectation
of the above-mentioned quantic of order 2(r + 1). It then follows from Lemma
B that the limit of $E_n\{C^2\}$ is zero and from (24):


$\lim_{n\to\infty} E_n\{|C|\} = \lim_{n\to\infty} E_n\{|A_n - B_n|\} = 0.$


This accomplishes the proof of Theorem I.


If we apply here what was shown in Part I about the asymptotic distribution 
of a quantic, we can also state the following. 
















THEOREM II: Under the conditions of Theorem I, the asymptotic distribution of a
differentiable statistical function $f\{S_n(x)\}$ is essentially determined by

a) the average distribution $\bar V_n(x)$;

b) the first non-vanishing derivative of f{V(x)} at the point $\bar V_n(x)$;

c) the average distribution excess


(25) $U_n(x, y) = \bar V_n(x) - \frac1n\sum_{\nu=1}^{n} V_\nu(x)V_\nu(y), \qquad x \le y,$

$\qquad\qquad\;\; = \bar V_n(y) - \frac1n\sum_{\nu=1}^{n} V_\nu(x)V_\nu(y), \qquad x \ge y.$


By “essentially determined” is meant determined except for an additional
function whose moments of any order are zero. The statement then follows
from Theorem I in connection with the fact that the asymptotic moments of
quantics have been computed in Part I from the values of $U_n(x, y)$.

That functions with all moments vanishing exist has been known for a long
time. A simple example given by Shohat and Tamarkin [6] is the following.
Let $\kappa$ be a positive constant smaller than ½, and $u = x^\kappa$, $k = \tan \kappa\pi$. Then,
the density (positive or negative)


(26) $g(x) = e^{-u}\sin(ku) = \mathrm{Im}\; e^{-u(1 - ik)}$


fulfills the condition. In fact, the nth moment of (26) is the (vanishing) imaginary
part of the integral


(27) $\frac{1}{\kappa}\int_0^\infty u^{(n+1)/\kappa - 1}\, e^{-u(1 - ik)}\,du = \frac{(-1)^{n+1}}{\kappa}\,(\cos \kappa\pi)^{(n+1)/\kappa}\;\Gamma\!\left(\frac{n+1}{\kappa}\right).$

Since g(x) takes negative values of the amount $e^{-u}$ it can be superimposed to a
given distribution density only in cases where the original density remains
greater than some multiple of $e^{-u} = \exp(-x^\kappa)$. It can be shown that the moment
problem is determinate (i.e. the distribution determined by the moments in a
unique way) if the density vanishes at infinity at a sufficiently strong degree.
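
The vanishing of the moments of (26) can be observed numerically. The sketch below is an illustration under stated assumptions: it takes the particular value $\kappa$ = 1/4 (so that k = 1), integrates in the variable u as in (27) with a composite Simpson rule, and truncates the range at a hypothetical upper limit where the integrand is negligible. For scale, the same integral is also computed with the sine factor dropped.

```python
import math

def simpson(func, a, b, steps):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def moment(n, kappa, signed=True, upper=80.0, steps=40000):
    """n-th moment of g(x) = exp(-x^kappa) * sin(k * x^kappa), k = tan(kappa * pi),
    evaluated after the substitution u = x^kappa. With signed=False the sine
    factor is omitted, which only serves as a reference magnitude."""
    k = math.tan(kappa * math.pi)
    power = (n + 1) / kappa - 1.0

    def integrand(u):
        osc = math.sin(k * u) if signed else 1.0
        return u ** power * math.exp(-u) * osc / kappa

    return simpson(integrand, 0.0, upper, steps)

for n in (0, 1, 2):
    print(n, moment(n, 0.25), moment(n, 0.25, signed=False))
```

The first printed column should be negligible compared with the second, in agreement with the statement that every moment of g vanishes.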

From the standpoint of statistical theory two distributions with the same 
moments throughout may be considered as equivalent. This justifies the ter- 
minology used in Theorem II. On the other hand, Theorem I is independent of 
this restriction: The asymptotic distribution of the statistical function f{S,(x) } 
is under the given conditions identical with that of the corresponding quantic 
of mth order. A detailed discussion of the case m = 2 will be given in Part III. 
Here follow some illustrations for the general case. 


4. Illustrations. The existence of asymptotic distributions of higher types 
can be exemplified in a comparatively simple way if we start from any known 
asymptotic distribution of a statistical function. 

Let us assume that g{V(x)} is a function fulfilling the condition


(28) $g\{\bar V_n(x)\} = 0$


for all n, and that the asymptotic c.d.f. for $g\{S_n(x)\}$ is known. There will be
some positive integer r such that


(29) $\mathrm{Prob}\,[g\{S_n(x)\} < zn^{-r/2}] \sim \Phi_n(z).$


If, for instance, g is a linear statistical function r will be 1 and, under well-known
conditions, $\Phi_n(z)$ a normal (Gaussian) c.d.f. with finite variance depending
on n.

Now, let f be an ordinary function of g and thus another statistical function
which may be denoted by f{V(x)}. According to the rules of differentiation
we have


(30) $f'\{V(x), y\} = \frac{df}{dg}\; g'\{V(x), y\}$


and analogous relations can be derived for the derivatives of higher order. In
particular, the following statement, valid in ordinary differential calculus, holds
true: If g{V(x)} has derivatives of every order and if the first s derivatives of f
with respect to g vanish at some point $g = g\{V_1(x)\}$ then also the s first derivatives
of f with respect to V(x) will be zero at $V(x) = V_1(x)$. In this way we can
devise statistical functions, with vanishing derivatives, for which the asymptotic
distribution is known.

For the sake of simplicity we may assume that (29) holds with r = 1 and
that f(g) is a monotonic increasing function, given in the form


(31) $f(g) = g^s[1 + \alpha(g)]$

with s a positive integer, and the inverse function

(31′) $g(f) = f^{1/s}[1 + \beta(f)]$

where $\beta(f)$ goes to zero with f → 0. Then, from (29):

(32) $\mathrm{Prob}\,[f\{S_n(x)\} < zn^{-s/2}] \sim \Phi_n(z')$

if z and z′ are connected by

$n^{-1/2}z' = g(n^{-s/2}z) = n^{-1/2}z^{1/s}[1 + \beta(n^{-s/2}z)].$

It follows that


$\lim_{n\to\infty} z' = z^{1/s}$


and if $\Phi_n(z')$ is supposed to be continuous, (32) becomes


(33) $\mathrm{Prob}\,[f\{S_n(x)\} < zn^{-s/2}] \sim \Phi_n(z^{1/s}).$


This is a distribution of type s.
Take as an example for g the arithmetical mean


(34) $g\{S_n(x)\} = \frac{x_1 + x_2 + \cdots + x_n}{n} - a_n$


where $x_1, x_2, \cdots, x_n$ are the observed values and $a_n$ is the arithmetical mean
of the mean values of $V_\nu(x)$. Then, under certain restrictions for the $V_\nu(x)$,
there exists a bounded sequence $h_n$ so that


$\mathrm{Prob}\,[\sqrt n\, g < z] \sim \Phi_n(z) = \frac{h_n}{\sqrt\pi}\int_{-\infty}^{z} e^{-h_n^2u^2}\,du.$

Now if we choose

$f = 6(g - \sin g) = g^3\left(1 - \frac{g^2}{20} + \cdots\right)$

the asymptotic distribution of f will be given by

$\mathrm{Prob}\,[n\sqrt n\, f \le z] \sim \Phi_n(\sqrt[3]{z}) = \frac{h_n}{\sqrt\pi}\int_{-\infty}^{\sqrt[3]{z}} e^{-h_n^2u^2}\,du$


with the probability density

$\frac{h_n}{3\sqrt\pi}\; z^{-2/3}\, e^{-h_n^2 z^{2/3}}.$


Similar examples can be drawn from the asymptotic distribution of $n\chi^2$ if one
asks for the distribution of appropriate functions of $n\chi^2$, etc.
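
The type 3 behaviour of f = 6(g − sin g) can be illustrated by a small simulation. The following sketch rests on assumptions that are not part of the text: all $V_\nu$ are taken equal to the uniform distribution on (−1, 1), so that $a_n$ = 0 and the limiting normal c.d.f. has standard deviation $\sqrt{1/3}$; the sample size, the number of trials and the seed are arbitrary. It compares the observed frequency of $n\sqrt n\, f \le z$ with the normal c.d.f. taken at the cube root of z.

```python
import math
import random

def signed_root(z, s):
    """Real s-th root that keeps the sign of z (intended for odd s)."""
    return math.copysign(abs(z) ** (1.0 / s), z)

def normal_cdf(w, sigma):
    """C.d.f. of a centred normal distribution with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(w / (sigma * math.sqrt(2.0))))

def empirical_cdf_of_f(z, n=100, trials=4000, seed=12345):
    """Observed frequency of n*sqrt(n)*f <= z, where f = 6*(g - sin g) and g is
    the mean of n draws from the (assumed) uniform distribution on (-1, 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        g = sum(rng.uniform(-1.0, 1.0) for _ in range(n)) / n
        f = 6.0 * (g - math.sin(g))
        if n * math.sqrt(n) * f <= z:
            hits += 1
    return hits / trials

sigma = math.sqrt(1.0 / 3.0)  # standard deviation of a single uniform(-1, 1) draw
for z in (-0.1, 0.05, 0.5):
    print(z, empirical_cdf_of_f(z), normal_cdf(signed_root(z, 3), sigma))
```

The two frequencies printed for each z should agree up to sampling error; the agreement is only an illustration of (33), not a substitute for the proof.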


PART III. SECOND-TYPE ASYMPTOTIC DISTRIBUTION 


1. Statement of the problem. We now propose to study the asymptotic 
distribution of a quantic of second order as defined in eq. (6) of Part I. It 
has been shown in Part II that this covers the case of any statistical function 
of which the first but not the second derivative at the critical point vanishes. 

Independently of what was said before, the problem can be stated in the following
way. Given a function $\psi(x, y)$ and a sequence of cumulative distribution
functions $V_1(x), V_2(x), V_3(x), \cdots$. Let $\bar V_n(x)$ be the arithmetical mean of
$V_1(x), V_2(x), \cdots, V_n(x)$ and $S_n(x)$ the repartition of a sample $x_1, x_2, \cdots, x_n$
drawn from the collective with the distribution element $dV_1(x_1)\,dV_2(x_2) \cdots
dV_n(x_n)$, that is: $nS_n(x)$ is the number of those of the observed values
$x_1, x_2, \cdots, x_n$ that are smaller than or equal to x. Then the quantity


(1) $f = \iint \psi(x, y)\,dT_n(x)\,dT_n(y)$, where $T_n(x) = S_n(x) - \bar V_n(x)$


is determined by the observations $x_1, x_2, \cdots, x_n$. We ask for the distribution
of f at large values of n.
Without loss of generality, the function $\psi(x, y)$ can be supposed to be symmetrical.
If, in particular, $\psi(x, y) = \psi(x)\psi(y)$, the quantity f becomes the
square of


(2) $\int \psi(x)\,dT_n(x) = \frac1n\sum_{\nu=1}^{n}\left[\psi(x_\nu) - \int \psi(x)\,dV_\nu(x)\right]$


and its asymptotic distribution can be computed in the manner shown in the last
section of Part I. Another example would be


(3) $\psi(x, y) = g(x)$ for $x \le y$, $\qquad \psi(x, y) = g(y)$ for $x \ge y$.


In this case, integration by parts shows that


(4) $f\{S_n(x)\} = \int g'(x)\,T_n^2(x)\,dx$


where g′ is the derivative of g. This is the statistical function that takes the
place of $\chi^2$ in continuous problems. See Introduction eq. (3).

Note that the “excess” $T_n(x)$ vanishes at $x = \pm\infty$ and that for sufficiently
large x the increment $dT_n(x)$ equals $-d\bar V_n(x)$. Thus, conditions for the existence
of the integrals in (1), (2), (4), etc. can be expressed in terms of the given
functions $\psi(x, y)$ and $V_\nu(x)$.

We shall first study the special case that implies so-called discontinuous chance
variables. In our terminology it is the function $\psi(x, y)$ that has to be specified.
Let $I_1, I_2, \cdots, I_k$ be k mutually exclusive one-dimensional intervals (or groups
of intervals) and $I_{k+1}$ their complement. Assume that $\psi(x, y)$ has a constant
value $\psi_{\iota\kappa}$ when x falls in $I_\iota$ and y falls in $I_\kappa$, ($\iota, \kappa$ = 1, 2, $\cdots$, k + 1). The increments
of $S_n(x)$, $\bar V_n(x)$, $T_n(x)$ in the interval $I_\kappa$ will be called $p_\kappa$, $\bar p_\kappa$, $\xi_\kappa$ respectively.
Clearly, $np_\kappa$ is the number of observed values falling in $I_\kappa$, $n\bar p_\kappa$ is the expected
number of such values, and $n(p_\kappa - \bar p_\kappa) = n\xi_\kappa$ the excess of observed over expected
numbers. Note that the given distributions $V_\nu(x)$ determine increments $p_{\nu\kappa}$
in the interval $I_\kappa$ and that


(5) $\bar p_\kappa = \frac1n\,(p_{1\kappa} + p_{2\kappa} + \cdots + p_{n\kappa}).$


Since the sum of all $\xi_\kappa$ must be zero we can replace $\xi_{k+1}$ by

(6) $\xi_{k+1} = -\xi_1 - \xi_2 - \cdots - \xi_k.$

Thus, the integral (1) can now be written as a sum of $k^2$ terms

(7) $f\{S_n(x)\} = \sum_{\iota,\kappa=1}^{k} \psi_{\iota\kappa}\,\xi_\iota\xi_\kappa$


like that introduced in the second eq. (8) of Part I.

Our next task will be to find the asymptotic distribution of (7) which depends
on the matrix $\psi_{\iota\kappa}$, ($\iota, \kappa$ = 1, 2, $\cdots$, k), and on the succession of probability
values $p_{\nu\kappa}$, ($\nu$ = 1, 2, 3, $\cdots$; $\kappa$ = 1, 2, $\cdots$, k). The matrix $\psi_{\iota\kappa}$ in k variables
will be supposed to be symmetrical.
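
The passage from the integral (1) to the sum (7) can be made concrete as follows. The sketch is only an illustration; the sample, the interval boundaries, the expected frequencies and the matrix are hypothetical. It computes the excesses $\xi_\kappa$ for k + 1 intervals, confirms that they add up to zero, and shows that eliminating $\xi_{k+1}$ by (6) turns the form in k + 1 variables into a form in k variables with a modified symmetric matrix.

```python
def excesses(sample, edges, expected):
    """xi = observed minus expected relative frequency in each interval.
    With sorted edges e_0 < e_1 < ..., interval 0 is x <= e_0, interval j is
    e_(j-1) < x <= e_j, and the last interval is x > e_last."""
    n = len(sample)
    counts = [0] * (len(edges) + 1)
    for x in sample:
        counts[sum(1 for e in edges if x > e)] += 1
    return [c / n - q for c, q in zip(counts, expected)]

def reduce_matrix(psi):
    """Eliminate xi_(k+1) = -(xi_1 + ... + xi_k) from a (k+1) x (k+1) form."""
    k = len(psi) - 1
    last = psi[k][k]
    return [[psi[i][j] - psi[i][k] - psi[k][j] + last for j in range(k)]
            for i in range(k)]

def quadratic_form(psi, xi):
    """Sum of psi[i][j] * xi[i] * xi[j] over all index pairs."""
    size = len(psi)
    return sum(psi[i][j] * xi[i] * xi[j] for i in range(size) for j in range(size))

# Hypothetical data: k = 2 intervals plus their complement.
sample = [0.2, 0.7, 1.1, 1.9, 2.4, 2.5, 3.3, 0.4]
edges = [1.0, 2.0]
expected = [0.3, 0.3, 0.4]
psi = [[2.0, 0.5, -1.0], [0.5, 1.0, 0.0], [-1.0, 0.0, 3.0]]

xi = excesses(sample, edges, expected)
full = quadratic_form(psi, xi)
reduced = quadratic_form(reduce_matrix(psi), xi[:-1])
print(xi, full, reduced)
```

The values `full` and `reduced` coincide, which is the content of the remark that (1) reduces to a sum of $k^2$ terms.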


2. Characteristic function. We define our chance variable as


(8) $x = \frac n2\, f = \frac n2 \sum \psi_{\iota\kappa}\,\xi_\iota\xi_\kappa.$


All summations, here and in what follows, are to be extended from 1 to k if
not otherwise indicated. If $P_n(x)$ is the c.d.f. of x, that is


(9) $\mathrm{Prob}\,\{\tfrac n2 f \le x\} = P_n(x)$


the characteristic function (c.f.) is defined by

(10) $Q_n(u) = E\{e^{uix}\} = \int e^{uix}\,dP_n(x).$


In order to compute $Q_n$ we assume that the quadratic form (8) is transformed,
by a linear transformation, into a sum of squares. Using appropriate (in general
complex) coefficients $\alpha_{\iota\kappa}$ one can write


(11) $x = \frac n2\,(\eta_1^2 + \eta_2^2 + \cdots + \eta_k^2), \qquad \eta_\iota = \sum_\kappa \alpha_{\iota\kappa}\,\xi_\kappa.$


(The form $\psi_{\iota\kappa}$ is here supposed to be non-singular which, however, means no
loss of generality.) It will be seen later that explicit knowledge of the $\alpha_{\iota\kappa}$
is not needed.

Now, for any real or complex y, the identity holds:


(12) $e^{y^2/2} = \frac{1}{\sqrt{2\pi}}\int e^{-t^2/2 + yt}\,dt.$


If we write v for $\sqrt{ui}$ and replace in (12) successively y by $v\sqrt n\,\eta_1$, $v\sqrt n\,\eta_2, \cdots$
we find


(13) $e^{uix} = (2\pi)^{-k/2}\int\!\!\int\cdots\int \exp\left[-\tfrac12\sum t_\iota^2 + v\sqrt n\sum z_\kappa\xi_\kappa\right] dt_1\,dt_2 \cdots dt_k$

where

(14) $\sum \eta_\iota t_\iota = \sum z_\kappa\xi_\kappa, \qquad z_\kappa = \sum_\iota \alpha_{\iota\kappa}\,t_\iota, \qquad (\kappa = 1, 2, \cdots, k).$


Since the first exponential factor in the integrand is a constant with respect
to the chance variable, the expected value of $e^{uix}$ is given by

(15) $Q_n(u) = E\{e^{uix}\} = (2\pi)^{-k/2}\int\!\!\int\cdots\int \exp\left(-\tfrac12\sum t_\iota^2\right) G_n\; dt_1\,dt_2 \cdots dt_k$

with

(16) $G_n = E\{\exp v\sqrt n\sum z_\kappa\xi_\kappa\}.$

In order to find $G_n$ we consider the following n collectives $C_1, C_2, \cdots, C_n$
with discontinuous, (k + 1)-valued distributions: In $C_\nu$ the label values are
$z_1, z_2, \cdots, z_k$ and $z_{k+1}$, with $z_{k+1} = 0$, their probabilities $p_{\nu 1}, p_{\nu 2}, \cdots, p_{\nu,k+1}$.
The c.f. of this distribution at the point $-iv/\sqrt n$ is


(17) $\sum_{\kappa=1}^{k+1} p_{\nu\kappa}\, e^{vz_\kappa/\sqrt n}.$











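If we multiply the n expressions (17) for $\nu = 1, 2, \cdots, n$ the product will be,
according to well-known rules of probability calculus, the c.f. for the distribution
of the sum of the n label components in the collective formed by combining
$C_1, C_2, \cdots, C_n$. This sum is


$n\sum_\kappa p_\kappa z_\kappa,$


where $np_\kappa$ is the number of observed values falling in $I_\kappa$, and therefore,


(18) $E\left\{\exp \frac{v}{\sqrt n}\sum_\kappa np_\kappa z_\kappa\right\} = \prod_{\nu=1}^{n}\left[\sum_{\kappa=1}^{k+1} p_{\nu\kappa}\, e^{vz_\kappa/\sqrt n}\right].$


Multiplying both sides of this equation by


(19) $\exp\left[-\frac{v}{\sqrt n}\sum_\kappa n\bar p_\kappa z_\kappa\right] = \exp\left[-\frac{v}{\sqrt n}\sum_{\nu=1}^{n}\sum_\kappa p_{\nu\kappa} z_\kappa\right]$


and using the abbreviation


(20) $\bar z_\nu = \sum_\kappa p_{\nu\kappa}\, z_\kappa$

we arrive at

(21) $G_n = E\{\exp v\sqrt n\sum \xi_\kappa z_\kappa\} = F_1F_2 \cdots F_n$

with

(22) $F_\nu = \sum_{\kappa=1}^{k+1} p_{\nu\kappa}\, e^{v(z_\kappa - \bar z_\nu)/\sqrt n}.$


This solves the problem: By inserting (21), (22) in (15) and carrying out the
integration with respect to $t_1, t_2, \cdots, t_k$ one has expressed $Q_n(u)$ in terms
of the given $p_{\nu\kappa}$ and of the coefficients $\alpha_{\iota\kappa}$ which link the $z_\kappa$ to the $t_\iota$. This expression
for $Q_n(u)$ holds for all n.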
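
The product representation (21), (22) holds exactly for every n and can be confirmed by direct enumeration when n is small. The sketch below is an illustration with hypothetical probabilities `p[nu][kappa]` and label values `z` (the last one being zero, as in the text); for simplicity it uses a real value of v, whereas the text takes $v = \sqrt{ui}$. The expectation on the left of (21) is computed by running through all possible outcomes of the n collectives.

```python
import itertools
import math

def g_direct(p, z, v):
    """E{exp(v * sqrt(n) * sum of xi_kappa * z_kappa)} by enumerating all outcomes.
    p[nu][kappa] is the probability of label kappa in collective nu."""
    n, m = len(p), len(z)
    p_bar = [sum(p[nu][kap] for nu in range(n)) / n for kap in range(m)]
    total = 0.0
    for outcome in itertools.product(range(m), repeat=n):
        prob = math.prod(p[nu][kap] for nu, kap in enumerate(outcome))
        xi = [outcome.count(kap) / n - p_bar[kap] for kap in range(m)]
        exponent = v * math.sqrt(n) * sum(x * zk for x, zk in zip(xi, z))
        total += prob * math.exp(exponent)
    return total

def g_product(p, z, v):
    """Product F_1 F_2 ... F_n with F_nu taken from (22)."""
    n = len(p)
    result = 1.0
    for row in p:
        z_bar = sum(q * zk for q, zk in zip(row, z))
        result *= sum(q * math.exp(v * (zk - z_bar) / math.sqrt(n))
                      for q, zk in zip(row, z))
    return result

# Hypothetical example: n = 4 collectives, k + 1 = 3 label values.
p = [[0.2, 0.5, 0.3], [0.6, 0.1, 0.3], [0.3, 0.3, 0.4], [0.25, 0.25, 0.5]]
z = [1.0, -0.5, 0.0]
v = 0.7
print(g_direct(p, z, v), g_product(p, z, v))
```

Both numbers agree to rounding error, since the n label components are independent and the exponential of a sum factorizes.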

We have still to show that the integral (15) exists, at least for small |u| or
|v|, independently of the value of n. For this purpose we develop $F_\nu$, as given
in (22), in the neighborhood of v = 0. At this point $F_\nu = 1$ and the first derivative
vanishes by virtue of (20). We thus have


(23) $F_\nu = 1 + \frac{v^2}{2n}\sum_{\kappa=1}^{k+1} p_{\nu\kappa}\,(z_\kappa - \bar z_\nu)^2\, e^{\vartheta_\nu v(z_\kappa - \bar z_\nu)/\sqrt n}$


with $|\vartheta_\nu| < 1$. From the definition of $z_\kappa$ in (14) it follows that the ratio $|z_\kappa|/T$
with

(24) $T^2 = t_1^2 + t_2^2 + \cdots + t_k^2$

has an upper bound depending on the $\alpha_{\iota\kappa}$ only. On the other hand, according
to (20), $\bar z_\nu$ is a weighted mean of the $z_\kappa$ and, therefore, $|z_\kappa - \bar z_\nu|$ will not surpass
twice the maximum $|z_\kappa|$:


(25) $|z_\kappa - \bar z_\nu| < aT$


where a is a positive function of the coefficients $\alpha_{\iota\kappa}$ which, in turn, are determined
by the $\psi_{\iota\kappa}$. Introducing (25) in (23) we find


$|F_\nu| < 1 + \frac{|v|^2a^2T^2}{2n}\, e^{|v|aT/\sqrt n} < e^{|v|^2a^2T^2/n}$


and, finally, from (21):

(26) $|G_n| < e^{|v|^2a^2T^2} = e^{|u|a^2T^2}.$


Thus it is seen that for


(27) $|u| < \frac{1}{2a^2}$ or $1 - 2a^2|u| > 0$


the integral (15) admits the upper bound

(28) $|Q_n(u)| < (2\pi)^{-k/2}\int\!\!\int\cdots\int e^{-(1 - 2a^2|u|)T^2/2}\,dt_1\,dt_2 \cdots dt_k = (1 - 2a^2|u|)^{-k/2}.$


It also follows that the contribution to $Q_n(u)$ from the region $T > T_0$ tends to
zero with increasing $T_0$, uniformly with respect to n and with respect to u in
the region $|u| < 1/2a^2$.


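3. Asymptotic value of $Q_n(u)$. If the quantity $F_\nu$ introduced in (22) is considered
as a function of $z_1/\sqrt n, z_2/\sqrt n, \cdots, z_k/\sqrt n$, we may write


(29) $F_\nu(z_1, z_2, \cdots, z_k) = \sum_{\kappa=1}^{k+1} p_{\nu\kappa}\, e^{v(z_\kappa - \bar z_\nu)}.$

Here, $\bar z_\nu$ is defined by (20) and, on the right-hand side, $z_{k+1}$ is zero. These functions
$F_\nu(z_1, z_2, \cdots, z_k)$ for $\nu$ = 1, 2, 3, $\cdots$ have all the properties required
in Lemma C of Part I: At the point $z_1 = z_2 = \cdots = z_k = 0$ one has $F_\nu = 1$,
the first derivatives are


$\frac{\partial F_\nu}{\partial z_\iota} = vp_{\nu\iota} - vp_{\nu\iota}\sum_{\kappa=1}^{k+1} p_{\nu\kappa} = 0$

and the second derivatives, ($\iota \neq \kappa$),


(30) $\frac{\partial^2 F_\nu}{\partial z_\iota^2} = v^2p_{\nu\iota}(1 - p_{\nu\iota}) - v^2p_{\nu\iota}\left[p_{\nu\iota} - p_{\nu\iota}\sum_{\lambda=1}^{k+1} p_{\nu\lambda}\right] = v^2p_{\nu\iota}(1 - p_{\nu\iota}),$

$\qquad \frac{\partial^2 F_\nu}{\partial z_\iota\,\partial z_\kappa} = v^2p_{\nu\iota}(-p_{\nu\kappa}) - v^2p_{\nu\kappa}\left[p_{\nu\iota} - p_{\nu\iota}\sum_{\lambda=1}^{k+1} p_{\nu\lambda}\right] = -v^2p_{\nu\iota}\,p_{\nu\kappa}.$


The third derivatives are certainly bounded in any finite region of the z-space,
and this means also in any finite region of the t-space.

The matrix of the second derivatives except for the factor $v^2$ is exactly that
defined in eq. (19) of Part I:


(31) $P^{(\nu)}_{\iota\kappa} = p_{\nu\iota}\,\delta_{\iota\kappa} - p_{\nu\iota}\,p_{\nu\kappa}$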







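and the arithmetical means of the derivatives form the matrix in eq. (23) of
Part I:


(31′) $P_{\iota\kappa} = \frac1n\sum_{\nu=1}^{n} P^{(\nu)}_{\iota\kappa} = \bar p_\iota\,\delta_{\iota\kappa} - \frac1n\sum_{\nu=1}^{n} p_{\nu\iota}\,p_{\nu\kappa}.$


Applying Lemma C we find

(32) $G_n = F_1F_2 \cdots F_n \sim \exp\left[\frac{v^2}{2}\sum_{\iota,\kappa} P_{\iota\kappa}\, z_\iota z_\kappa\right],$

each $F_\nu$ being taken at the arguments $z_1/\sqrt n, z_2/\sqrt n, \cdots, z_k/\sqrt n$.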


This is valid in any finite t-region. Since it has been shown at the end of the
foregoing section that, for small |v|, the outside contribution to the integral
(15) converges uniformly (for all n) towards zero, we are allowed to introduce
(32) in (15). Writing


(33) $\sum_{\iota,\kappa} P_{\iota\kappa}\, z_\iota z_\kappa = \sum_{\iota,\kappa} \gamma_{\iota\kappa}\, t_\iota t_\kappa$

whereby $\gamma_{\iota\kappa} = \sum_{\rho,\sigma} P_{\rho\sigma}\,\alpha_{\iota\rho}\,\alpha_{\kappa\sigma}$,

equation (15) becomes


(34) $Q_n(u) \sim (2\pi)^{-k/2}\int\!\!\int\cdots\int \exp\left[-\tfrac12\sum t_\iota^2 + \tfrac12\,ui\sum \gamma_{\iota\kappa}\,t_\iota t_\kappa\right] dt_1\,dt_2 \cdots dt_k.$


Now, it is well known that if $m_{\iota\kappa}$ is any positive definite matrix with the determinant
$|m_{\iota\kappa}|$, then


(35) $(2\pi)^{-k/2}\int\!\!\int\cdots\int \exp\left[-\tfrac12\sum m_{\iota\kappa}\,t_\iota t_\kappa\right] dt_1\,dt_2 \cdots dt_k = \frac{1}{\sqrt{|m_{\iota\kappa}|}}.$

This is likewise true if the matrix $m_{\iota\kappa}$, which we also call M, has the form $M =
M_1 - \lambda M_2$ where $M_1$ is positive definite, $M_2$ arbitrary (complex) and $|\lambda|$ sufficiently
small. Thus, the integration formula (35) applies to (34) and the result
is reached, for small |u|:


(36) $Q_n(u) \sim Q(u) = \frac{1}{\sqrt{D(ui)}}$ with $D(\lambda) = |\delta_{\iota\kappa} - \lambda\gamma_{\iota\kappa}|.$

If the $\alpha_{\iota\kappa}$ which transform the given quadric into a sum of squares are known,
(36) with (33) supply the solution of our problem.
The formula (36) is susceptible of several useful transformations. Let us
write A for the matrix $\alpha_{\iota\kappa}$, A′ for the transposed matrix, and $\Psi$, P, $\Gamma$, I respectively
for the matrices $\psi_{\iota\kappa}$, $P_{\iota\kappa}$, $\gamma_{\iota\kappa}$, $\delta_{\iota\kappa}$. Then, obviously


(37) $\Psi = A'A, \qquad \Gamma = APA', \qquad M = I - ui\,\Gamma.$


If we multiply M by A′ to the left and by A to the right, we obtain

(38) $A'MA = A'IA - ui\,A'APA'A = \Psi - ui\,\Psi P\Psi.$









In this operation the determinant of M is multiplied by $|\psi_{\iota\kappa}|$. Thus $D(\lambda)$
can be written as


(39) $D(\lambda) = \frac{|\psi_{\iota\kappa} - \lambda\Pi_{\iota\kappa}|}{|\psi_{\iota\kappa}|}$ with $\Pi_{\iota\kappa} = \sum_{\rho,\sigma} \psi_{\iota\rho}\,P_{\rho\sigma}\,\psi_{\sigma\kappa}.$

Here, the knowledge of the $\alpha_{\iota\kappa}$ is no longer required.
If the matrix (38) is multiplied twice by $\Psi^*$, the inverse of $\Psi$, we find $\Psi^* - ui\,P$
and, therefore,


(40) $D(\lambda) = |\psi_{\iota\kappa}| \times |\psi^*_{\iota\kappa} - \lambda P_{\iota\kappa}|.$


As P is positive definite and $\Psi^*$ real, it follows that all roots of $D(\lambda)$,
the “Eigenwerte” of $\Gamma$, are real numbers. Therefore, $D^{-1/2}(ui)$ is a regular
function along the real axis in the u-plane. Thus, (36) which was proved so
far for small |u| only remains valid for all real values of u: The c.f. of the
asymptotic distribution is represented by $D^{-1/2}(ui)$ for all real u-values.
Multiplying (38) only once by $\Psi^*$ we obtain one of the two forms


(41) $I - ui\,\Psi P$ or $I - ui\,P\Psi$

which lead to


(42) $D(\lambda) = |\delta_{\iota\kappa} - \lambda S_{\iota\kappa}| = |\delta_{\iota\kappa} - \lambda S_{\kappa\iota}|, \qquad S_{\iota\kappa} = \sum_{\rho} \psi_{\iota\rho}\,P_{\rho\kappa}.$


Although this formula has been derived by means of $\Psi^*$ it can be seen by continuity
considerations that it remains valid whatever the (symmetric) matrix
$\psi_{\iota\kappa}$ is. The formula makes it clear that the asymptotic distribution of the
quadric $\sum \psi_{\iota\kappa}\,\xi_\iota\xi_\kappa$ is completely determined by the “Eigenwerte” of the matrix
$S = \Psi P$. This bears out our second main theorem in Part II, as far as
quadrics of the form (8) are concerned. It will be seen in sec. 5 how (42) applies
to the continuous case.

We, finally, apply to (36) a transformation that is valid only if P has an inverse
matrix $P^*$. (As shown in Part I, sec. 4 this is not the case if the k intervals to
which the subscripts 1, 2, $\cdots$, k refer cover the whole range of the variables
$x_1, x_2, \cdots, x_n$.) Multiplying (41) by $P^*$ we find the matrix $P^* - ui\,\Psi$ and
thus

(43) $D(\lambda) = |P_{\iota\kappa}| \times |P^*_{\iota\kappa} - \lambda\psi_{\iota\kappa}|.$


This is equivalent to

(44) $Q(u) = |P_{\iota\kappa}|^{-1/2}\,(2\pi)^{-k/2}\int\!\!\int\cdots\int \exp\left[-\tfrac12\sum P^*_{\iota\kappa}\,\xi_\iota\xi_\kappa + \tfrac12\,ui\sum \psi_{\iota\kappa}\,\xi_\iota\xi_\kappa\right] d\xi_1\,d\xi_2 \cdots d\xi_k.$


According to the definition of the characteristic function eq. (44) can be interpreted
as stating that


(45) $|P_{\iota\kappa}|^{-1/2}\,(2\pi)^{-k/2}\exp\left[-\tfrac12\sum P^*_{\iota\kappa}\,\xi_\iota\xi_\kappa\right]$


is the asymptotic probability density for the simultaneous occurrence of $\xi_1$,
$\xi_2, \cdots, \xi_k$. The expression (45) can be arrived at by applying the Central
Limit Theorem to the case of k independent chance variables. Since, however,
$P^*$ does not exist in general, eq. (44) would not be a suitable point of departure
for developing the theory that concerns us here.
















4. Asymptotic value of $P_n(x)$, illustrations. The relationship between the
c.f. and the c.d.f. of a distribution is well known and need not be discussed here
in detail. We shall use, in this section, two aspects of this relationship only.
First, the continuity theorem, first proved by G. Pólya [5], stating that if the
c.f. $Q_n(u)$ tend towards a limiting function Q(u), the corresponding c.d.f. $P_n(x)$
tend towards the P(x) that corresponds to Q(u). Second, the additivity, i.e.,
if Q(u) is of the form $\alpha Q'(u) + \beta Q''(u)$ with $\alpha + \beta = 1$, then P(x) is
$\alpha P'(x) + \beta P''(x)$ with the P′(x), P″(x) corresponding to Q′(u) and Q″(u)
respectively. The following three groups of examples will illustrate the application
of the foregoing results.

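a) Let us first consider a function of two excess values $\xi_1$, $\xi_2$ only


(46) $x = \frac n2\, f = \frac n2\,(A\xi_1^2 + 2B\xi_1\xi_2 + C\xi_2^2)$


where the matrix $\Psi$ is given by $\psi_{11} = A$, $\psi_{12} = \psi_{21} = B$, $\psi_{22} = C$. The product
matrix $P\Psi$ is


(47) $\begin{pmatrix} AP_{11} + BP_{12} & BP_{11} + CP_{12} \\ AP_{12} + BP_{22} & BP_{12} + CP_{22} \end{pmatrix}$

and the determinant of $I - \lambda P\Psi$

(48) $D(\lambda) = 1 - \lambda[AP_{11} + 2BP_{12} + CP_{22}] + \lambda^2(AC - B^2)(P_{11}P_{22} - P_{12}^2).$


If $\lambda_1$, $\lambda_2$ are the two real roots of $D(\lambda) = 0$, the asymptotic probability density
of x will be


(49) $\frac{dP(x)}{dx} = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{e^{-uix}\,du}{\sqrt{\left(1 - \dfrac{ui}{\lambda_1}\right)\left(1 - \dfrac{ui}{\lambda_2}\right)}}.$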

We are particularly interested in the case that P is "complete," i.e. a matrix
with all horizontal and vertical sums vanishing. Then $P_{11} = P_{22} = -P_{12} = p_1p_2$,
the last term in (48) cancels out and the only Eigenwert is $\lambda = 1/[(A - 2B + C)\,p_1p_2]$.
Here, instead of (49) we have

(50) $\frac{dP(x)}{dx} = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{e^{-uxi}\,du}{\sqrt{1 - ui/\lambda}} = \sqrt{\frac{\lambda}{\pi}}\,\frac{e^{-\lambda x}}{\sqrt{x}}$.

This is, with respect to $\sqrt{|x|}$, a Gauss distribution with the variance
$|A - 2B + C|\,p_1p_2/2$.


















If, in addition to the assumption that P is "complete" (i.e. in the present case
that $p_{\nu 1} + p_{\nu 2} = 1$ for all $\nu$) the further assumption is made that the two intervals
$I_1$ and $I_2$ cover the whole range of the original chance variables $x_1, x_2, x_3, \cdots$,
one would have also $\xi_1 + \xi_2 = 0$ and from (46)

$x = \frac{n}{2}(A - 2B + C)\,\xi_1^2$.

In this case, $\sqrt{|x|}$ is a linear statistical function and the Central Limit Theorem
leads to the same result as that expressed in (50). It is seen, however, from our
derivation, that (50) holds under wider conditions: If $p_{\nu 1} + p_{\nu 2} = 1$ for all $\nu$,
there may exist another interval $I_3$ within the range of the chance variables
$x_1, x_2, x_3, \cdots$ so that $\xi_1 + \xi_2$ is not necessarily zero.

The latter remark suggests the following general theorem: If f is a function
of the k variables $\xi_1, \xi_2, \cdots, \xi_k$ and g another such function but vanishing when
$\xi_1 + \xi_2 + \cdots + \xi_k = 0$, then f and f + g have the same asymptotic distribution
provided that for each $\nu$ the sum $p_{\nu 1} + p_{\nu 2} + \cdots + p_{\nu k} = 1$. In the case of
quadrics this result is equivalent to the following matrix theorem: If $P$, $\Psi$, $A$
are symmetric matrices, $P$ with all horizontal and vertical sums equal to zero,
$\Psi$ arbitrary, and $A$ of the form $a_{\iota\kappa} = a_\iota + a_\kappa$, then the two products

(51) $P\Psi$ and $P(\Psi + A)$

have the same characteristic roots. This can be proved by the usual methods
of matrix calculus. The matrix $PA$ has all characteristic roots equal to zero.*
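The matrix theorem can be spot-checked numerically. The sketch below is only an illustration, not part of the original argument: the probabilities `p`, the matrix `psi` and the vector `a` are arbitrary values chosen here, and exact rational arithmetic is used so that the two characteristic determinants can be compared for equality. Agreement at four values of lambda settles the identity for 3 x 3 matrices, since both determinants are polynomials of degree at most three.

```python
from fractions import Fraction


def det(m):
    """Determinant by exact Gaussian elimination on Fractions."""
    a = [row[:] for row in m]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= factor * a[c][j]
    return result


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][m] * y[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def char_det(s, lam):
    """D(lambda) = det(I - lambda * S)."""
    n = len(s)
    return det([[(1 if i == j else 0) - lam * s[i][j] for j in range(n)]
                for i in range(n)])


# Assumed illustrative data: p gives a "complete" P (all row and column sums zero).
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
P = [[(p[i] if i == j else 0) - p[i] * p[j] for j in range(3)] for i in range(3)]
psi = [[Fraction(v) for v in row] for row in [[2, -1, 3], [-1, 5, 0], [3, 0, 1]]]
a = [Fraction(4), Fraction(-7), Fraction(2)]
A = [[a[i] + a[j] for j in range(3)] for i in range(3)]   # a_ik = a_i + a_k
psi_plus_A = [[psi[i][j] + A[i][j] for j in range(3)] for i in range(3)]

lams = [Fraction(1, 3), Fraction(-2), Fraction(5, 7), Fraction(11)]
same_roots = all(char_det(matmul(P, psi), l) == char_det(matmul(P, psi_plus_A), l)
                 for l in lams)
# All characteristic roots of PA equal to zero means det(I - lambda*PA) = 1.
pa_roots_zero = all(char_det(matmul(P, A), l) == 1 for l in lams)
print(same_roots, pa_roots_zero)
```

The reason the check succeeds is that $PA$ reduces to the rank-one matrix with elements $\sum_\mu P_{\iota\mu}a_\mu$, whose column sums vanish together with those of $P$.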
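b) In the definition of Karl Pearson's test function which is usually called
$\chi^2$, it is presumed that a sample is drawn from the combination of n equal
distributions. In this case all $P^{(\nu)}$ are equal and coincide with $\bar{P}$ which then can
simply be written $P$:

(52) $P_{\iota\kappa} = p_\iota\delta_{\iota\kappa} - p_\iota p_\kappa$.

The chance variable we now consider will be

(53) $x = \frac{n}{2}\sum_{\iota}\frac{\xi_\iota^2}{p_\iota} = \frac{1}{2}\chi^2$.

Thus $\psi_{\iota\kappa} = \delta_{\iota\kappa}/p_\iota$ and the elements of $P\Psi$ are

(53') $(P\Psi)_{\iota\kappa} = \sum_{\mu} P_{\iota\mu}\psi_{\mu\kappa} = \delta_{\iota\kappa} - p_\iota$.

The matrix $I - \lambda P\Psi$ has the elements
$\delta_{\iota\kappa}(1 - \lambda) + \lambda p_\iota$.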


If the kth column is subtracted from any one of the others, only two terms remain,
one equal to $1 - \lambda$ and one equal $-(1 - \lambda)$ in the last row. Thus, the


* A proof of the matrix theorem has meanwhile been published by Alfred Brauer, Bull. 
Amer. Math. Soc., Vol. 53 (1947), pp. 605-607. 










determinant $D(\lambda)$ includes (k − 1) times the factor $(1 - \lambda)$. On the other hand,
$D(\lambda)$ is of degree (k − 1) and has the absolute term 1. Therefore

(54) $D(\lambda) = (1 - \lambda)^{k-1}$.

This supplies the $\chi^2$-distribution with (k − 1) "degrees of freedom"

(55) $Q(u) = (1 - ui)^{-(k-1)/2}$, $\quad \frac{dP}{dx} = \frac{x^{(k-3)/2}\,e^{-x}}{\Gamma\left(\frac{k-1}{2}\right)}$ $\quad (x \geq 0)$.

Again, our result is slightly more general than that reached in the usual theory.
It includes the case that in addition to the k intervals with the probabilities
$p_1, p_2, \cdots, p_k$ (whose sum is 1) there are other intervals with probability zero.
On the other hand, if to $\chi^2$ a term of the form $n\sum(a_\iota + a_\kappa)\xi_\iota\xi_\kappa$ is added, this
would not change the asymptotic distribution.
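The determinant (54) can be confirmed for a concrete set of probabilities. The following sketch is an illustration only: the values in `p` are assumed, not taken from the paper. It builds $P$ from (52), forms $P\Psi$ with $\psi_{\iota\kappa} = \delta_{\iota\kappa}/p_\iota$, and compares $\det(I - \lambda P\Psi)$ with $(1 - \lambda)^{k-1}$ in exact arithmetic.

```python
from fractions import Fraction


def det(m):
    """Determinant by exact Gaussian elimination on Fractions."""
    a = [row[:] for row in m]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= factor * a[c][j]
    return result


# Assumed cell probabilities (sum to 1); any positive values would do.
p = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
k = len(p)

# (52): P_ik = p_i * delta_ik - p_i * p_k
P = [[(p[i] if i == j else 0) - p[i] * p[j] for j in range(k)] for i in range(k)]
# Psi is diagonal with entries 1/p_k, so (P Psi)_ik = P_ik / p_k
P_psi = [[P[i][j] / p[j] for j in range(k)] for i in range(k)]

# Elements of P*Psi should be delta_ik - p_i.
elements_match = all(P_psi[i][j] == (1 if i == j else 0) - p[i]
                     for i in range(k) for j in range(k))


def D(lam):
    return det([[(1 if i == j else 0) - lam * P_psi[i][j] for j in range(k)]
                for i in range(k)])


lams = [Fraction(-3), Fraction(1, 4), Fraction(2), Fraction(7, 3), Fraction(10)]
det_matches = all(D(l) == (1 - l) ** (k - 1) for l in lams)
print(elements_match, det_matches)
```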

One may ask for other quadratic functions of $\xi_1, \xi_2, \cdots, \xi_k$ whose asymptotic
distribution is given by (55). In particular, one might be interested in a generalization
of $\chi^2$ for the case of unequal original distributions. The answer can easily
be given by introducing the cofactors of order (k − 1) and of order (k − 2) of the
determinant $|P_{\iota\kappa}|$. It was mentioned in sec. 4 of Part I that all cofactors of
order (k − 1), in the case of "complete" P, have the same value. It may be
denoted by $\Delta$. The cofactor corresponding to the lines $\iota$, $\kappa$ and the columns

, H Will be denoted 7 Hho vy With TW = Oife = cord = up. 
2 the integrers 1, 2, ,k 















Then, if lis any one 


—_ 


(56) Wx = TL eset 5 


A Kel 















is one possible solution. 





In fact, the product PW has in this case the elements 
(PV). = du, for 


(57) =-—-1,“.=Ii«+l 
g = 0, k=l 


Qc l 
























The determinant of $I - \lambda P\Psi$ is then seen to equal $(1 - \lambda)^{k-1}$.
The solution (56), however, is unsymmetrical in the sense that it does not
include any terms with $\xi_l$. A completely symmetrical solution in which all
$\xi_\iota$ play the same role is given by

(58) 


e _ » a 
Wee — kA eli«l 


M4 


According to (57) the matrix $P\Psi$ now consists of terms (k − 1)/k in the principal
diagonal and −1/k at all other places, that is

(58') $(P\Psi)_{\iota\kappa} = \delta_{\iota\kappa} - \frac{1}{k}$.




In the same way as in the case of (53') it can be seen that the determinant of
$I - \lambda P\Psi$ equals here $(1 - \lambda)^{k-1}$. The asymptotic distribution of $\sum\psi_{\iota\kappa}\xi_\iota\xi_\kappa$ with
the coefficients (58) is, therefore, the $\chi^2$-distribution with (k − 1) degrees of freedom.

If the formula (58) is applied to the case of equal $P^{(\nu)}$ the corresponding
quadric becomes


beet _r,, (Le) 
co P. , k io P. . ° ; 


that is, $\chi^2$ + a term vanishing with $\xi_1 + \xi_2 + \cdots + \xi_k$. One can easily modify
(58) so that it leads to $\chi^2$ without any addition.

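c) A third group of examples where the asymptotic density is expressed by
simple functions is that where $D(\lambda)$ is an exact square, that is, all characteristic
roots (except the one that is zero) have even multiplicities. Let us assume k =
2m + 1 and let $\lambda_1, \lambda_2, \cdots, \lambda_m$ be m double roots. Then

(59) $Q(u) = \prod_{\mu=1}^{m}\left(1 - \frac{ui}{\lambda_\mu}\right)^{-1} = \sum_{\mu=1}^{m}\frac{A_\mu}{1 - ui/\lambda_\mu}$

with

(59') $A_\mu = \prod_{\nu \neq \mu}\left(1 - \frac{\lambda_\mu}{\lambda_\nu}\right)^{-1}$

and therefore

(60) $\frac{dP}{dx} = \sum_{\mu=1}^{m} A_\mu\lambda_\mu e^{-\lambda_\mu x}$, $\quad x \geq 0$.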
Assume, for instance, that all original distributions are uniform, that is 
PY = Pa =the 5 
and that the quadric f is given in the form (11) with the following a,, : 
a. = Vke fore = 1 
= Vke. 7 e>1«=1,2,---,6-1 
=-(-lDvVkn "e>lk=e 
= 0 ™e>l«=etilet 2,-°-,k. 
Then, the y,, as defined in (33) become 


(61) 


Yu= cule — 16, forcork >1 
(62) 


= Q “t=«x«=]1 


and D(A) according to (36) takes the form 


(63)  — I fl — rae — Di. 

















In other terms, for the quadric

$f = kc_1(\xi_1 + \cdots + \xi_k)^2 + kc_2(\xi_1 - \xi_2)^2 + kc_3(\xi_1 + \xi_2 - 2\xi_3)^2 + \cdots + kc_k[\xi_1 + \xi_2 + \cdots + \xi_{k-1} - (k - 1)\xi_k]^2$

the characteristic $\lambda$-values are $1/c_\iota\,\iota(\iota - 1)$.
Now, to obtain the case of m double roots with k = 2m + 1 we have simply
to choose

$c_2 = 3c_3$, $\quad 3c_4 = 5c_5$, $\quad 5c_6 = 7c_7$, $\cdots$.

The first term on the right-hand side can be entirely omitted in accordance with
what was said in connection with (51). Besides, for the same reason, the expression
can be simplified in various ways by assuming $\xi_1 + \xi_2 + \cdots + \xi_k = 0$.
As a numerical example, take k = 5, $c_2 = 3$, $c_3 = 1$, $c_4 = 5$, $c_5 = 3$. Then


f = 208 + E+ & + 20 €i + 20 5 — ite — bets — Esti + 10 Eaks) 








leads to the characteristic values $\lambda = 1/6$ and $1/60$ and the asymptotic density
becomes

$\frac{dP}{dx} = \frac{1}{54}\left(e^{-x/60} - e^{-x/6}\right)$.
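The numerical density can be reproduced from the partial-fraction form of the characteristic function. The sketch below assumes distinct double roots and uses the coefficients $A_\mu = \prod_{\nu \neq \mu}(1 - \lambda_\mu/\lambda_\nu)^{-1}$. The function names are introduced here for illustration only.

```python
import math


def coefficients(roots):
    """A_mu = prod over nu != mu of 1 / (1 - lambda_mu / lambda_nu), distinct roots assumed."""
    return [math.prod(1.0 / (1.0 - lm / ln)
                      for j, ln in enumerate(roots) if j != i)
            for i, lm in enumerate(roots)]


def density(x, roots):
    """dP/dx = sum of A_mu * lambda_mu * exp(-lambda_mu * x) for x >= 0."""
    return sum(a * lm * math.exp(-lm * x)
               for a, lm in zip(coefficients(roots), roots))


# Double roots of the numerical example with k = 5.
roots = [1 / 6, 1 / 60]
weights = [a * lm for a, lm in zip(coefficients(roots), roots)]
total_mass = sum(coefficients(roots))      # integral of the density over x >= 0
mean = sum(a / lm for a, lm in zip(coefficients(roots), roots))

print(weights)      # coefficients of exp(-x/6) and exp(-x/60)
print(total_mass)   # should be 1
print(mean)         # should equal 6 + 60, the sum of the 1/lambda
```

The two weights come out as −1/54 and +1/54, which is the density stated in the text. The total mass is 1 because the $A_\mu$ sum to unity.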

In a similar way other groups of quadrics with asymptotic distributions of
the type (60) can easily be constructed. One may, for instance, use eq. (41)
and make vanish, in the matrix $S = P\Psi$, all elements on one side of the diagonal
so that the roots are immediately known.












5. Transition to the continuous case. In this concluding section, the transition
to the case of a quadric of the form (1) with continuous $\psi(x, y)$ will be
outlined. The formula best fit for this purpose is eq. (36). We therefore
suppose the statistical function f given as

(64) $f = \iint \psi(x, y)\,dT_n(x)\,dT_n(y)$ with $\psi(x, y) = \int a(r, x)\,a(r, y)\,dr$.


In analogy to (33) we derive

(65) $\gamma(x, y) = \iint a(x, s)\,a(y, t)\,dU_n(s, t) = \int a(x, s)\,a(y, s)\,dV_n(s) - \iint a(x, s)\,a(y, t)\,dW_n(s, t)$.


Since dW is symmetric, this function $\gamma(x, y)$ is symmetric with respect to x and
y. If $D(\lambda)$ denotes the Fredholm determinant of the "kernel" $\gamma(x, y)$, we conclude
from (36) that the characteristic function of the asymptotic distribution
of f will be given by

(66) $Q_n(u) \sim \frac{1}{\sqrt{D(ui)}}$

if certain convergence conditions are satisfied.

In order to establish (66) the main point is to find a sequence of functions
$\psi_1(x, y), \psi_2(x, y), \cdots$ each of the type considered in the foregoing Sections and
such that 1) the distribution of the quadric $f_k$ with the coefficients $\psi_k$ tends towards
the distribution of f with increasing k and independently of n; and 2) that
the determinants $D_k$ corresponding to $\gamma_k$ converge towards D as k increases indefinitely.
Using our Lemma A we can replace the first condition by asking
that the expectation of $|f - f_k|$ should go to zero with $k \to \infty$ independently of n.

The following assumptions shall be made concerning f and the $V_\nu(x)$: The
function $a(r, x)$ in (64) is continuous and bounded in every finite region; there
exist two positive continuous functions $\alpha(r)$, $\beta(x)$ such that

(67) $|a(r, x)| \leq \alpha(r)\beta(x)$
and that the integrals 


(68) [é@a=mM, [a@av@, | #@ ave) 
exist, the latter two being bounded and converging uniformly with respect to 


v. We are going to devise a step function ¥;(z, y) so that for the corresponding 
f, and any positive & 


(69) Et \f—fel} Sa. 
Let N be an upper bound of the integrals 


(70) [s@av.e@ sy, [a@a¥@) <N 


and e = «,/(65 + 8 N). Choose a value L such that 


(71) / a(x) dV,(x) < * [ee dV(2) < £ 


|jz|>L M’ M 


and, calling B the maximum of @(z) in|z| < L, another quantity R such that 


— 2 € 
72) Ie (r) dr < 2B?" 


We subdivide, in the z-y-r-space, the domain|az| < L,|y| < L,|r| < Rin 
k° equal cells where k is determined by the condition that the absolute value 
of the variation of a(r, x)a(r, y) within each cell does not exceed €/4R. Outside 
this domain we set ¥;(7, x) = O while inside the domain a;,(r, x)a;(r, y) shall 







equal the value that a(r, x)a(r, y) assumes in the center of the respective cell. 
Then (x, y) will be defined by 


(73) dle, 9) = / ee 


From the definition of k and from (67) and (72) it follows that 


ite.) — ale, 01 I, ale, dele ~ ale, dade Nae 


(74) os |a(r, x)a(r, y)| dr 
|r| >R 


€ 2 € a 
S 2R = + (8) [oe drS,+B a= 


as long as|az|< TL, |y| < L. If this square is called (LZ) and the comple- 
mentary region (L) we have 


f-fe=[[ We, — nl, 9] at.@ aT.) 
(75) (L) 
+ is v(x, y) dT,(x) dT,(y) 


and since the integral of | d7’,(x) dT,(y) | is not larger than 4, while, according 
to (64) and (67) 


(76) |v(e, v) | S 8(2)8) f or) dr = Ma(x)a@) 
we conclude from (74) and (75) 
(77) f-fhls4e+M]{ B@eW \a7.@ aT. |. 
(L) 
This gives 
Gs) HIS fel Ste + Mf] s@~QEl\dT.@) a7.) |). 


Now, from | d7,,| = | dS, = dV,| < dT, + 2dV, and from the formulas 
derived in Part II, 


E{ dT,(z)} =0,  E{dT,(x) dT,(y)} = ~ dale, y) 
it follows 


(79) E{ | aT,(x) dT,(y) |} < ~ dala, y) + 4dV,(x) dVa(y) 





with 
(79’) dU,(a, y) = 6(x, y) dV,(x) — dW,(x, y) S 6(2, y) dV,(z). 


If this is introduced in (78) and (71) taken into account, we find 


E\\f—fel) S4e+M*[ pte) AV (x) 
N Yiz|>L 


+m | I RC OLLACELAC) 


S det e+ 4X Ne S64 8Ne= 4 


as required in (69). 
On the other hand, it can be seen that the kernel y(x, y) as defined in (65) 
is the limit of the sequence y;(z, y) 


v(x, y) = II ax (x, s)axz(y, t) dUa(s, 0) for x, y in (R) 
(81) (L) 
= 0 for x, y in (R) 


where (R) means the region|z| < R,|y| < R and (R) the complementary 
region. In fact, from the definition of k and eqs. (67) and (71) one has for z, y 
in (R): 


| € FF ' 
| y(a, y) is viz, y) | < 4R IT. | dU,(s, t) | 


— IT | a(x, slay, t) dU,(s, t) | 


St awaw| f spy Bh aPC) 


ro} > I, B(s)8(0 aV.(s) aV,(t) 


N val 
€ , « . 
s oP + a(x)a(y) iM (1 + 2N). 


Since a(x) is bounded, the right-hand side goes to zero with e. Finally, for 
z, y in (R) we have 


ir, a) — we, sf f lale, daly, ans, 0 | 
(83) a(x)a(y) il 87(s) dV,(s) 


dp > If B(s)B(t) dV,(s) av.(.| 







Here, the two terms in the brackets are bounded, but $\alpha(x)\alpha(y)$ goes to zero as R
increases. The conclusion is that $\gamma_k(x, y)$ tends uniformly towards $\gamma(x, y)$
with $k \to \infty$.

Thus, eq. (66) is established provided that the function $\gamma(x, y)$ defined in (65)
has a Fredholm determinant $D(\lambda)$ that is the limit of the corresponding algebraic
determinants and provided that the c.f. $\sqrt{1/D(ui)}$ leads to a c.d.f. with
bounded derivative.

As an example let us consider the case

(84) $a(r, x) = \begin{cases} \sqrt{g'(r)} & \text{for } r \geq x \\ 0 & \text{for } r < x. \end{cases}$

This function is not continuous as it was assumed in establishing (66). However,
the existence of a single discontinuity line, x = r, does not invalidate the
argument. We assume $g'(r) \geq 0$ and equal to dg/dr. Then, in the case of
(84):

(85) $\psi(x, y) = \int a(r, x)\,a(r, y)\,dr = \begin{cases} -g(y) & \text{for } x \leq y \\ -g(x) & \text{for } x \geq y. \end{cases}$

Since, however, adding to $\psi$ a function of x or of y alone does not change the
value of f, we can also use

(85') $\psi(x, y) = \begin{cases} g(x) & \text{for } x \leq y \\ g(y) & \text{for } x \geq y. \end{cases}$

The statistical function f that corresponds to (84) can be computed either from
(85) or (85'), or directly from (84) if we use the formula that follows from (64)

(86) $f = \int\left[\int a(r, x)\,dT_n(x)\right]^2 dr$.

The integral in the brackets is, in our case, seen to equal $\sqrt{g'(r)}\,T_n(r)$, thus

(86') $f = \int g'(r)\,[S_n(r) - V_n(r)]^2\,dr$.

This is exactly the test function $\omega^2$ mentioned in the Introduction, eq. (3).


To find the distribution of f we have to compute $\gamma(x, y)$. Its definition (65)
can be written in the form

(87) $\gamma(x, y) = \frac{1}{n}\sum_{\nu}\left[\int a(x, s)\,a(y, s)\,dV_\nu(s) - \int a(x, s)\,dV_\nu(s)\int a(y, s)\,dV_\nu(s)\right]$.

This supplies in the case of (84)

(88) $\gamma(x, y) = \begin{cases} \sqrt{g'(x)g'(y)}\,[V_n(x) - \overline{V_\nu(x)V_\nu(y)}] & \text{for } x \leq y \\ \sqrt{g'(x)g'(y)}\,[V_n(y) - \overline{V_\nu(x)V_\nu(y)}] & \text{for } x \geq y. \end{cases}$

Here, the second term in the brackets is the arithmetical mean of the products
$V_\nu(x)V_\nu(y)$.

If the distributions $V_\nu(x)$ are all equal (independent of $\nu$) we have simply to
write $V(x)$ instead of $V_n(x)$ and $V(x)V(y)$ instead of $\overline{V_\nu(x)V_\nu(y)}$. If, in addition,
the distributions in the original collectives are uniform in the basic interval
0 to 1, one has

(89) $\gamma(x, y) = \begin{cases} \sqrt{g'(x)g'(y)}\;x(1 - y) & \text{for } 0 \leq x \leq y \leq 1 \\ \sqrt{g'(x)g'(y)}\;y(1 - x) & \text{for } 0 \leq y \leq x \leq 1. \end{cases}$

This is the case dealt with in Smirnoff's papers [7, 8]. If, finally, $g'(x)$ is supposed
to be equal to 1 in the interval 0, 1, we arrive at a kernel $\gamma(x, y)$ whose
Fredholm determinant is well known:

(90) $\gamma(x, y) = \begin{cases} x(1 - y) & \text{for } x \leq y \\ y(1 - x) & \text{for } x \geq y, \end{cases}$ $\qquad D(\lambda) = \frac{\sin\sqrt{\lambda}}{\sqrt{\lambda}}$.

This supplies immediately the c.f. and (in form of a definite integral) the c.d.f.
of the asymptotic distribution of $\omega^2$ for $g' = 1$.

The same result can be reached without the use of $a(r, x)$ if we apply one of
the transformations discussed in the foregoing Section. Take, for instance,
instead of $\gamma(x, y)$ the unsymmetric kernel $\sigma(x, y)$ corresponding to the matrix
$S = P\Psi$ defined in (41). If all original distributions are equal, the element of
S can be written as

(91) $\sigma_{\iota\kappa} = \sum_{\mu} P_{\iota\mu}\psi_{\mu\kappa} = p_\iota\left(\psi_{\iota\kappa} - \sum_{\mu}\psi_{\mu\kappa}\,p_\mu\right)$.

Calling $v(x)$ the density $dV(x)/dx$ in the continuous case, the corresponding
kernel becomes

(92) $\sigma(x, y) = v(x)\left[\psi(x, y) - \int \psi(s, y)\,v(s)\,ds\right]$.

With the $\psi$-values from (85'), $g' = 1$, $v = 1$, this gives

(92') $\sigma(x, y) = \begin{cases} x - y + \frac{y^2}{2} & \text{for } x \leq y \\ \frac{y^2}{2} & \text{for } x \geq y. \end{cases}$

It can easily be seen that the "Eigenfunctions" of this $\sigma(x, y)$ are $\sin(\sqrt{\lambda_m}\,x)$
with $\lambda_m = m^2\pi^2$, and, therefore, the Fredholm determinant is that indicated in
(90).
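The closed form of the Fredholm determinant can be compared with a discretized determinant. The sketch below is a numerical illustration under assumptions made here: the kernel $x(1 - y)$ for $x \leq y$ is sampled at the midpoints of n equal cells (a simple quadrature choice, not the construction used in the text), and the algebraic determinant $\det(I - \lambda hK)$ is evaluated by Gaussian elimination.

```python
import math


def kernel(x, y):
    """x(1 - y) for x <= y and y(1 - x) for x >= y."""
    return min(x, y) * (1.0 - max(x, y))


def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    result = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[pivot][c] == 0.0:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            if factor != 0.0:
                for j in range(c + 1, n):
                    a[r][j] -= factor * a[c][j]
    return result


def fredholm_det(lam, n=80):
    """Approximate D(lambda) by det(I - lambda * h * K) on a midpoint grid of n cells."""
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]
    m = [[(1.0 if i == j else 0.0) - lam * h * kernel(pts[i], pts[j])
          for j in range(n)] for i in range(n)]
    return det(m)


def closed_form(lam):
    r = math.sqrt(lam)
    return math.sin(r) / r


for lam in (1.0, 5.0, math.pi ** 2):
    print(lam, fredholm_det(lam), closed_form(lam))
```

The last value of lambda is the first characteristic root $\pi^2$, where both expressions should be close to zero.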

It might be added that the expectation and the asymptotic variance of $\omega^2$
can be computed, independently of the distribution, from the formulas developed
in Part I. The results are

(93) $nE\{\omega^2\} = \int g'(x)\,[V_n(x) - \overline{V_\nu^2(x)}]\,dx$

and, in the case of all $V_\nu(x)$ equal

(94) $n^2\,\mathrm{Var}\{\omega^2\} \sim 4\iint_{x \leq y} g'(x)\,g'(y)\,V^2(x)\{1 - V(y)\}^2\,dx\,dy$.

These formulas have already been given in [4].

Another, more general, remark is this. If all $V_\nu(x)$ are equal, one can reduce
the problem, by a transformation of the original chance variable x into $x' = V(x)$,
to the case of a uniform distribution over the interval 0 to 1. If the $V_\nu(x)$
are not equal, it might still be possible to find a transformation $x' = x'(x)$ such
that all original distributions extend over a finite region on the $x'$-axis only.
In this case the restrictions concerning the behavior of the distributions at
infinity drop out.


REFERENCES 

[1] Harald Cramér, "On the composition of elementary errors," Skand. Aktuarietidskrift, Vol. 11 (1928), pp. 13-74, 141-180.

[2] R. v. Mises, "Les lois de probabilité pour les fonctions statistiques," Ann. de l'Inst. Henri Poincaré, Vol. 6 (1936), pp. 185-212.

[3] ---, "Sur les fonctions statistiques," Soc. math. de France, Conférence de la Réunion internat. des Mathématiciens, Paris, 1937.

[4] ---, Wahrscheinlichkeitsrechnung und ihre Anwendung, Leipzig and Wien, 1931.

[5] G. Pólya, "Über den zentralen Grenzwertsatz der Wahrscheinlichkeitsrechnung," Math. Zeitschr., Vol. 8 (1920), pp. 171-181.

[6] J. A. Shohat and J. D. Tamarkin, The Problem of Moments, Math. Surveys No. 1, New York, 1943.

[7] N. V. Smirnoff, "On the distribution of the $\omega^2$-criterion of Mises," (In Russian), Recueil Math., nouvelle série, Vol. 2 (1937), pp. 973-993.

[8] ---, "Sur la distribution de $\omega^2$ (criterium de M. von Mises)," Comptes Rendus Paris, Vol. 202 (1936), p. 449.

[9] Vito Volterra, Leçons sur les Fonctions de Ligne, Paris, 1913.

[10] Vito Volterra and Joseph Pérès, Théorie générale des Fonctionnelles, Paris, 1936.





APPROXIMATE SOLUTIONS FOR MEANS AND VARIANCES IN A 
CERTAIN CLASS OF BOX PROBLEMS 


By Philip J. McCarthy


Social Science Research Council 


1. Summary. Consider n boxes, each box having an associated probability,
$p_i$, $(\sum p_i = 1)$, and an associated integer, $k_i$. If balls are thrown one by one
into these boxes, the probability being $p_i$ that any one ball falls into the ith box,
then the number of balls which must be thrown in order to obtain, for the first
time, at least $k_{i_1}$ balls in the $i_1$th box, at least $k_{i_2}$ balls in the $i_2$th box, $\cdots$, and at
least $k_{i_s}$ balls in the $i_s$th box, is a random variable, $N_s[k_1(p_1), k_2(p_2), \cdots, k_n(p_n)]$.
Here $i_1, i_2, \cdots, i_s$ represent the numbers of that set of s boxes, $(1 \leq s \leq n)$,
which first satisfies the stated condition.

The distribution of $N_s[k_1(p_1), k_2(p_2), \cdots, k_n(p_n)]$ can be written down for any
set of values assigned to n, s, the $p_i$'s and the $k_i$'s. However, for n greater than
2 the distribution assumes such an extremely complicated multinomial form
that except for certain special cases even the mean of the distribution cannot
be numerically evaluated without a prohibitive amount of labor.

This paper presents the exact moments of $N_1[k_1(p_1), k_2(p_2)]$ and $N_2[k_1(p_1), k_2(p_2)]$
in forms that readily lend themselves to computation and shows how these
moments can be used to obtain approximate values for the mean and variance
for certain situations where n is greater than two. These approximation formulae
are given for

1. The mean and variance, for any n and any set of $k_i$'s and $p_i$'s when s = 1
or n.

2. The mean, for any n and 2 < s < n—1, when pj = 

(¢ = 1, 2, --- , n). 
Some indications are given concerning the error of the approximations, and the 
circumstances which lead to a minimum (and maximum) error. Curves have 
been prepared to show the mean for the two box case, the primary function 
of these curves being to assist in the application of the approximation formulae. 
Some problems where the results of this paper might be applicable are suggested 
in the Introduction. 


2. Introduction. A box problem is defined when one is given a fixed number 
of boxes, a collection of balls (either finite or infinite), a set of rules governing 
the throwing of the balls into the boxes and a statement of the conditions which 
will bring the throwing to an end. The terminating conditions usually state 
either that a fixed number of balls will be thrown or that balls will be thrown 
until a particular distribution of balls in the boxes has been obtained. In the 
first of these, interest is centered on the possible distributions which can be obtained,
while in the latter the number of balls necessary to obtain a specified
distribution is of primary interest.

This paper will be concerned with certain problems falling in the latter category.
In the simplest case one is given two boxes with associated probabilities
$p_1$ and $p_2$ and associated integers $k_1$ and $k_2$. Balls are thrown one by one into
the two boxes, the probability being $p_1$ that any one ball goes in the first box and
$p_2$ that it goes in the second box. This process is stopped when either $k_1$ balls
fall in box 1 or $k_2$ balls in box 2, whichever occurs first. One is interested in the
distribution of the number of balls necessary to terminate the throwing. This
problem was stated in essentially this form by Laplace [4], but he contented
himself with merely writing down the probability generating function.

Here the special case of two boxes will be treated in detail and the results 
will then be generalized to the n-box case. In all of these instances it is pos- 
sible to write down exact expressions for the mean and variance of the number of 
balls required to achieve the stated distribution. However, in almost every 
case the resulting expressions are too complicated to be of any use when a numer- 
ical answer is desired. The principal portion of this paper will be devoted to 
obtaining approximate formulae from which numerical answers can be obtained 
for these problems. Some evaluation of the degree of approximation will be 
given in section 5, while curves to facilitate the computation will be given in 
section 6. 

The statement of these problems in terms of boxes and balls may lead one to
the belief that they have no other interpretation. Actually this is not the case,
and a few illustrations of this point will now be given. For example, consider
the curtailed single sampling plan used in acceptance sampling. A buyer receives
a lot of articles. This lot will contain a certain proportion of defective
items. The buyer wishes to determine on the basis of sampling whether to
accept or reject the lot. His knowledge of his own situation will allow him to
specify the largest proportion of defectives which he is ordinarily willing to
accept and the risk he is willing to take of accepting a lot with a proportion defective
larger than this critical proportion. On the basis of these two values it
is possible to set up a sampling plan in which the buyer will take a sample of size
n out of the lot, inspect it, and reject the lot if there are $k_1$ or more defectives in the
sample. Of course once he has obtained $k_1$ defectives there is no need to inspect
the remainder of the sample. The lot will then be automatically rejected.
Similarly, once he has obtained $n - k_1$ non-defectives, he can accept the lot without
inspecting the remainder of the items. The average number of items which
he must inspect in order to reach a decision is given by the solution to the two
box problem stated above. Box 1 will receive the defective items, the associated
integer being $k_1$ and the associated probability being $p_1$, the true proportion
of defectives in the lot. Box 2 will receive the non-defective items, the
associated integer being $n - k_1$ and the associated probability being $p_2$, the true
proportion of non-defectives in the lot.

Laplace [4] considered problems of this type as applied to games of chance. 





Thus suppose there are two players A and B who participate in successive trials
of a given event, the probability being $p_1$ that A wins on any one trial and $p_2$
that B wins. Then one can associate the integer $k_1$ with A and $k_2$ with B by
saying that A wins the match if he wins $k_1$ trials before B wins $k_2$ trials and conversely.
The analysis is exactly the same as for the two box problem. It is
apparent that this same situation can be extended to any number of players.

Another possible interpretation is as a particular kind of random walk problem.
Let a particle start at the origin of a system of rectangular coordinates
and suffer successive positive unit displacements, the probability being $p_1$ that
it moves one unit in the x-direction and $p_2$ that it moves one unit in the y-direction.
Furthermore assume that it is absorbed if it ever reaches the line
$x = k_1$ or the line $y = k_2$. Then the analysis of the above two box problem
gives the mean number of displacements before it is absorbed. In the same
manner, such a random walk problem can be stated for n dimensions. For n
equal to three, there will be three planes and the particle will be absorbed when
it reaches any one of the three.

Certain problems in public opinion polling may fit into this category of box
problems, particularly if the above problem is rephrased so that one requires
the mean number of trials to obtain at least $k_1$ balls in the first box and at least
$k_2$ balls in the second box, for the first time. For example, suppose that one
desires to sample from a population composed of two types of individuals,
A and B. Let the population proportions of A and B be known and be denoted
by $p_1$ and $p_2$. Then if one wishes to obtain at least $k_1$ individuals of type
A and at least $k_2$ individuals of type B, the average number of persons who must
be chosen in order to fulfill this condition is given by the analysis of the corresponding
box problem. This is rather artificial when there are only two categories
and $p_1 + p_2 = 1$. However, these restrictions will be removed in the
course of the paper, and the problem will be considered for any number of types
of individuals.

As a final example, consider one of the many bombing problems which arose
during the course of war research. Suppose that a factory which is to be demolished
has n vital units, the destruction of any one of which will destroy the
usefulness of the factory. Let the probability be $p_1$ of hitting the first unit with
a single bomb, $p_2$ the probability of hitting the second with a single bomb, etc.,
and assume that $k_1$ bomb hits will finish off the first unit, $k_2$, the second, etc.
Then the mean number of bombs required will be given by the analysis for the
corresponding box problem.

Corresponding interpretations are possible for the other problems which are 
to be considered in this paper. Some of these will be indicated as the analysis 
proceeds and it is to be hoped that others will occur to the reader. 

As previously noted, this paper will be concerned with the distribution of balls
necessary to terminate the throwing, assuming the p's are known. Another
possible interpretation is to assume the p's unknown and to estimate them with
the results of the ball throwing. Certain aspects of this problem for two boxes
have been considered by J. B. S. Haldane [3] and Girshick, Mosteller and Savage
[2].


3. Solution for the two box case. 


3.1. Distribution and moments of the number of trials necessary to obtain either
$k_1$ balls in the first box or $k_2$ balls in the second box. This problem may be stated
as follows: Suppose one is given two boxes with associated probabilities $p_1$ and
$p_2$, and associated integers $k_1$ and $k_2$. For the present it will be assumed that
$p_1 + p_2 = 1$, although this restriction will be removed later. Now let balls be
thrown one by one into these two boxes, the probability being $p_1$ that a particular
ball will fall in the first box and $p_2$ that it will fall in the second box. This
process is stopped on the first ball which leaves either $k_1$ balls in the first box or
$k_2$ balls in the second box. The number of balls, x, which is required to accomplish
this is a random variable and we desire the moments of x. The probability
that $k_1$ balls are obtained in the first box on the xth throw, $k_1 \leq x \leq k_1 + k_2 - 1$,
before $k_2$ balls are obtained in the second box, is immediately seen to be

(3.1) $\left[\binom{x-1}{k_1-1}p_1^{k_1-1}p_2^{x-k_1}\right]p_1 = \binom{x-1}{k_1-1}p_1^{k_1}p_2^{x-k_1}$.

Similar reasoning gives the probability that $k_2$ balls are obtained in the second
box for the first time on the xth throw, $k_2 \leq x \leq k_1 + k_2 - 1$, as

(3.2) $\binom{x-1}{k_2-1}p_2^{k_2}p_1^{x-k_2}$.

From (3.1) and (3.2), the hth moment of x, $E(x^h)$, is

(3.3) $\sum_{x=k_1}^{k_1+k_2-1} x^h\binom{x-1}{k_1-1}p_1^{k_1}p_2^{x-k_1} + \sum_{x=k_2}^{k_1+k_2-1} x^h\binom{x-1}{k_2-1}p_2^{k_2}p_1^{x-k_2}$.
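For small $k_1$ and $k_2$ the sums in (3.3) can be evaluated directly. The sketch below is an illustration with function names chosen here. It builds the distribution of x from (3.1) and (3.2) in exact rational arithmetic and returns the hth moment, assuming $p_2 = 1 - p_1$.

```python
from fractions import Fraction
from math import comb


def stopping_distribution(k1, k2, p1):
    """Probability that the throwing stops at throw x, from (3.1) plus (3.2)."""
    p2 = 1 - p1
    dist = {}
    for x in range(min(k1, k2), k1 + k2):
        prob = Fraction(0)
        if x >= k1:   # box 1 receives its k1-th ball on throw x
            prob += comb(x - 1, k1 - 1) * p1**k1 * p2**(x - k1)
        if x >= k2:   # box 2 receives its k2-th ball on throw x
            prob += comb(x - 1, k2 - 1) * p2**k2 * p1**(x - k2)
        dist[x] = prob
    return dist


def moment(k1, k2, p1, h=1):
    """h-th moment of x as in (3.3)."""
    return sum(x**h * pr for x, pr in stopping_distribution(k1, k2, p1).items())


half = Fraction(1, 2)
print(stopping_distribution(2, 2, half))   # x = 2 and x = 3, each with probability 1/2
print(moment(2, 2, half))                  # mean number of throws, 5/2
print(sum(stopping_distribution(3, 5, Fraction(2, 7)).values()))   # total probability
```

With $k_1 = k_2 = 2$ and $p_1 = p_2 = 1/2$ the throwing ends on the second or third ball with equal probability, so the mean is 5/2.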

However, it is inconvenient to consider (3.3) directly. A much simpler procedure
is to determine the increasing factorial moments of x and then transform
these into the ordinary moments. Thus the hth increasing factorial moment of
x, $F_{h,1}[k_1(p_1), k_2(p_2)]$, is defined as $E[x(x + 1) \cdots (x + h - 1)]$. Then $F_{h,1}[\ ]$
is equal to

(3.4) $\sum_{x=k_1}^{k_1+k_2-1}\frac{(x+h-1)!}{(x-1)!}\binom{x-1}{k_1-1}p_1^{k_1}p_2^{x-k_1} + \sum_{x=k_2}^{k_1+k_2-1}\frac{(x+h-1)!}{(x-1)!}\binom{x-1}{k_2-1}p_2^{k_2}p_1^{x-k_2}$.

(3.4) can be transformed by means of the relationship

(3.5) $\sum_{i=0}^{a}\binom{k+i}{i}p^i = (1 - p)^{-(k+1)}\,I_{1-p}(k + 1, a + 1)$,

where $I_x(p, q)$ is the Incomplete Beta-Function as tabulated by Karl Pearson
[6], and the result is obtained that

(3.6) $F_{h,1}[k_1(p_1), k_2(p_2)] = \frac{k_1(k_1 + 1) \cdots (k_1 + h - 1)}{p_1^h}\,I_{p_1}(k_1 + h, k_2) + \frac{k_2(k_2 + 1) \cdots (k_2 + h - 1)}{p_2^h}\,I_{p_2}(k_2 + h, k_1)$.

The ordinary hth moment of x may be written in terms of $F_{1,1}[\ ], F_{2,1}[\ ], \cdots, F_{h,1}[\ ]$ as

(3.7) $E(x^h) = \sum_{i=1}^{h} F_{i,1}[\ ]\,\frac{\Delta^i 0^h}{i!}\,(-1)^{h-i}$,

where $\Delta^i 0^h$ represents a difference of zero. Tabular values of $\Delta^i 0^h/i!$ are given
by Fisher and Yates [1].

In particular, the mean and variance of x, which will receive the special designations
$E_1[k_1(p_1), k_2(p_2)]$ and $\sigma_1^2[k_1(p_1), k_2(p_2)]$ respectively, are

(3.8) $\frac{k_1}{p_1}\,I_{p_1}(k_1 + 1, k_2) + \frac{k_2}{p_2}\,I_{p_2}(k_2 + 1, k_1)$

and

(3.9) $\frac{k_1(k_1 + 1)}{p_1^2}\,I_{p_1}(k_1 + 2, k_2) + \frac{k_2(k_2 + 1)}{p_2^2}\,I_{p_2}(k_2 + 2, k_1) - E_1[k_1(p_1), k_2(p_2)] - \{E_1[k_1(p_1), k_2(p_2)]\}^2$.
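The closed forms for the mean and variance can be checked against direct summation. The sketch below is illustrative and rests on one stated assumption: for positive integer arguments the Incomplete Beta-Function is evaluated through the standard binomial-tail identity $I_p(a, b) = \sum_{j=a}^{a+b-1}\binom{a+b-1}{j}p^j(1-p)^{a+b-1-j}$ rather than from tables. The function names are introduced here.

```python
from fractions import Fraction
from math import comb


def inc_beta(p, a, b):
    """I_p(a, b) for positive integers a, b via the binomial tail identity."""
    n = a + b - 1
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(a, n + 1))


def mean_closed(k1, k2, p1):
    """Mean of x in terms of the Incomplete Beta-Function."""
    p2 = 1 - p1
    return (k1 / p1 * inc_beta(p1, k1 + 1, k2)
            + k2 / p2 * inc_beta(p2, k2 + 1, k1))


def variance_closed(k1, k2, p1):
    """Variance of x: E[x(x+1)] - E[x] - (E[x])^2."""
    p2 = 1 - p1
    second = (k1 * (k1 + 1) / p1**2 * inc_beta(p1, k1 + 2, k2)
              + k2 * (k2 + 1) / p2**2 * inc_beta(p2, k2 + 2, k1))
    m = mean_closed(k1, k2, p1)
    return second - m - m**2


def direct_moments(k1, k2, p1):
    """Mean and variance by summing the stopping probabilities directly."""
    p2 = 1 - p1
    m1 = m2 = Fraction(0)
    for x in range(min(k1, k2), k1 + k2):
        prob = Fraction(0)
        if x >= k1:
            prob += comb(x - 1, k1 - 1) * p1**k1 * p2**(x - k1)
        if x >= k2:
            prob += comb(x - 1, k2 - 1) * p2**k2 * p1**(x - k2)
        m1 += x * prob
        m2 += x * x * prob
    return m1, m2 - m1**2


p = Fraction(2, 5)
print(mean_closed(3, 4, p), variance_closed(3, 4, p))
print(direct_moments(3, 4, p))
```

Both routes use exact fractions, so the two printed pairs should agree exactly.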


In the event the p's are equal and sum to one, $E_1[k_1(p_1), k_2(p_2)]$ will be abbreviated
to $E_1[k_1, k_2]$, and finally, if both the p's and k's are equal, it will be written as
$E_1[k^2]$. In this two box situation, the only other possibility is $E_2[k_1(p_1), k_2(p_2)]$,
which will denote the expected number of balls required to obtain at least $k_1$
in the first box and at least $k_2$ in the second box, for the first time. This problem
will be considered in section 3.2.

In order to facilitate the computation of mean values, both for the two box
problem itself and for its application to problems involving a larger number of
boxes, (3.8) has been graphed for various values of $k_1$, $k_2$, $p_1$ and $p_2$. A discussion
of this procedure and the results obtained will be found in section 6.

There is one further result which will later prove useful. Consider the situation
when there is only one box with $p_1$ and $k_1$, $p_1 < 1$. This is the same as
having two boxes where the $k_2$ corresponding to the second box is infinite. In
other words, one can terminate the throwing of balls only because of what happens
to the first box, never because of anything that happens to the second box.
In this case one obtains

(3.10) $E_1[k_1(p_1), \infty(p_2)] = \sum_{x=k_1}^{\infty} x\binom{x-1}{k_1-1}p_1^{k_1}p_2^{x-k_1} = \frac{k_1}{p_1}$.

Similarly,

(3.11) $\sigma_1^2[k_1(p_1), \infty(p_2)] = \frac{k_1p_2}{p_1^2}$.


3.2. Distribution and moments of the number of throws necessary to obtain at
least $k_1$ balls in the first box and at least $k_2$ balls in the second box. This problem
may be stated as follows: Suppose there are two boxes with associated probabilities
$p_1$ and $p_2$, and associated integers $k_1$ and $k_2$. As in 3.1, $p_1 + p_2 = 1$.
Let balls be thrown into the boxes one by one, the probability being $p_1$ that a
particular ball will fall in the first box and $p_2$ that it will fall in the second box.
This process is stopped on the first ball which leaves at least $k_1$ in the first box
and exactly $k_2$ in the second, or at least $k_2$ in the second and exactly $k_1$ in the first.
Again $x$ is the number of balls required to accomplish this. As explained in
3.1, the mean value in this case will be written as $E_2[k_1(p_1), k_2(p_2)]$. The analysis
follows through as in 3.1 and the mean number of trials is equal to


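$$\sum_{x=k_1+k_2}^{\infty} x \binom{x-1}{k_1-1} p_1^{k_1} p_2^{x-k_1} + \sum_{x=k_1+k_2}^{\infty} x \binom{x-1}{k_2-1} p_2^{k_2} p_1^{x-k_2}. \tag{3.12}$$

Making use of (3.5), this can be written as

$$E_2[k_1(p_1), k_2(p_2)] = \frac{k_1}{p_1}\,[1 - I_{p_1}(k_1 + 1, k_2)] + \frac{k_2}{p_2}\,[1 - I_{p_2}(k_2 + 1, k_1)]. \tag{3.13}$$

Referring to (3.8) it is evident that

$$E_1[k_1(p_1), k_2(p_2)] + E_2[k_1(p_1), k_2(p_2)] = \frac{k_1}{p_1} + \frac{k_2}{p_2}. \tag{3.14}$$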
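Relation (3.14) can be checked numerically by summing the defining series for the two means. The sketch below assumes the stopping probabilities described in 3.1 and 3.2; the function names and the truncation length `terms` are illustrative choices, not the paper's.

```python
from math import comb


def e1_two_boxes(k1, p1, k2, p2):
    """Mean throws until box 1 holds k1 balls OR box 2 holds k2 balls (p1 + p2 = 1)."""
    total = 0.0
    for x in range(k1, k1 + k2):       # the stopping ball falls in box 1
        total += x * comb(x - 1, k1 - 1) * p1 ** k1 * p2 ** (x - k1)
    for x in range(k2, k1 + k2):       # the stopping ball falls in box 2
        total += x * comb(x - 1, k2 - 1) * p2 ** k2 * p1 ** (x - k2)
    return total


def e2_two_boxes(k1, p1, k2, p2, terms=4000):
    """Mean throws until box 1 holds k1 balls AND box 2 holds k2 balls, as in (3.12)."""
    total = 0.0
    for x in range(k1 + k2, k1 + k2 + terms):   # truncated infinite series
        total += x * comb(x - 1, k1 - 1) * p1 ** k1 * p2 ** (x - k1)
        total += x * comb(x - 1, k2 - 1) * p2 ** k2 * p1 ** (x - k2)
    return total


lhs = e1_two_boxes(2, 0.3, 3, 0.7) + e2_two_boxes(2, 0.3, 3, 0.7)
rhs = 2 / 0.3 + 3 / 0.7
print(lhs, rhs)   # the two values agree to rounding error
```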


The $h$th increasing factorial moment in this problem, denoted by
$F_{h,2}[k_1(p_1), k_2(p_2)]$, is

$$\frac{k_1(k_1+1)\cdots(k_1+h-1)}{p_1^h}\,[1 - I_{p_1}(k_1 + h, k_2)] + \frac{k_2(k_2+1)\cdots(k_2+h-1)}{p_2^h}\,[1 - I_{p_2}(k_2 + h, k_1)]. \tag{3.15}$$

Comparison of (3.15) with (3.6) gives the relationship

$$F_{h,1}[\ ] + F_{h,2}[\ ] = \frac{k_1(k_1+1)\cdots(k_1+h-1)}{p_1^h} + \frac{k_2(k_2+1)\cdots(k_2+h-1)}{p_2^h}. \tag{3.16}$$

The ordinary moments of $x$ can be computed from (3.15) by the use of (3.7).
That is, formula (3.7) holds in this case if $F_{h,1}[\ ]$ is replaced by $F_{h,2}[\ ]$.







It can be easily shown by the use of the recursion relationship for the Incomplete
Beta-Function,

$$I_x(p, q) = x\,I_x(p - 1, q) + (1 - x)\,I_x(p, q - 1),$$

that $F_{h,1}[\ ]$ and $F_{h,2}[\ ]$ satisfy the partial difference equation

$$F_{h,i}[k_1(p_1), k_2(p_2)] = h\,F_{h-1,i}[k_1(p_1), k_2(p_2)] + p_1 F_{h,i}[(k_1 - 1)(p_1), k_2(p_2)] + p_2 F_{h,i}[k_1(p_1), (k_2 - 1)(p_2)], \tag{3.17}$$

where $i = 1$ or $2$. This equation can be used as an alternative way of obtaining
many results, examples of which are (3.10) and (3.11). Certain of these applications
have been discussed by McCarthy [5].
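For $h = 1$ and $i = 1$, and taking the zeroth moment to be one, (3.17) reduces to a first-step recursion for the mean. The sketch below assumes the boundary value $E_1 = 0$ whenever one of the required counts is already zero; the function name is illustrative. Letting $k_2$ grow large recovers (3.10) numerically.

```python
from functools import lru_cache


def e1_by_recursion(k1, p1, k2, p2):
    """E_1[k1(p1), k2(p2)] from (3.17) with h = 1, i = 1 and F_0 = 1."""
    @lru_cache(maxsize=None)
    def mean(a, b):
        if a == 0 or b == 0:          # assumed boundary: nothing left to wait for
            return 0.0
        return 1.0 + p1 * mean(a - 1, b) + p2 * mean(a, b - 1)
    return mean(k1, k2)


print(e1_by_recursion(2, 0.5, 2, 0.5))    # 2.5
print(e1_by_recursion(3, 0.4, 200, 0.6))  # close to k1/p1 = 7.5, as in (3.10)
```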


4. Solution for the n box case. 


4.1. Preliminary discussion. The problems of this section, although direct
generalizations of the two box cases, can perhaps be most easily stated and
illustrated as applied to the behavior of a random particle. Suppose that
we have a random particle which starts at the origin of $n$-dimensional rectangular
coordinates and moves in unit steps along the positive coordinate axes. At
any given point the probability will be taken as $p_i$ that it moves in the $x_i$-direction.
$\sum_{i=1}^{n} p_i$ is assumed to be one unless otherwise specified. Now consider the
$n$ hyperplanes, $x_i = k_i$, and assume that the particle will be absorbed if it passes
through a specified number, say $s$, of these hyperplanes. Notice that we are
interested only in the number of planes which it passes through, and not in the
particular ones. For each $s$, $(s = 1, 2, \cdots, n)$, the number of moves which the
particle makes before it is absorbed is a random variable, and in this section we
will be concerned with the distribution of this random variable. The corresponding
interpretation for boxes and balls is immediately obvious.
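The random particle just described is easy to simulate. The following is a minimal Monte Carlo sketch, not a method used in the paper; the function name, the seed and the sample size are assumptions chosen for illustration.

```python
import random


def moves_until_absorbed(ks, ps, s, rng):
    """Simulate one particle; return the number of unit moves before absorption.

    ks[i] is the hyperplane x_i = k_i, ps[i] the probability of a step in the
    x_i-direction, and s the number of hyperplanes that must be reached.
    """
    position = [0] * len(ks)
    moves = 0
    while sum(x >= k for x, k in zip(position, ks)) < s:
        axis = rng.choices(range(len(ks)), weights=ps)[0]
        position[axis] += 1
        moves += 1
    return moves


rng = random.Random(1947)   # fixed seed so that runs are repeatable
sample = [moves_until_absorbed([2, 3, 4], [0.2, 0.3, 0.5], 2, rng) for _ in range(2000)]
print(sum(sample) / len(sample))   # Monte Carlo estimate of the mean for s = 2
```

With $s = 1$ the sample mean estimates $E_1[\ ]$, and with $s = n$ it estimates $E_n[\ ]$.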

These problems are seen to be generalizations of the two box cases considered 
in section 3. Although it is always relatively easy to write down formal ex- 
pressions for the quantities to be considered, the step from two boxes to three 
or more boxes produces expressions which are extremely difficult, or even im- 
possible, to evaluate. In this section we shall develop approximate solutions 
which make use only of simple computations based on the solution for the two 
box case. 

As an introduction to the contents of this section, we shall discuss briefly a 
box problem which is a special case of the general problem. Assume that there 
are n boxes with a probability of 1/n that any one ball will be thrown into a 
particular one of the n boxes. Then one can ask for the mean and variance of 







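the number of trials required to obtain $s$ occupied boxes (i.e. $k_1 = k_2 = \cdots = k_n = 1$).
Making use of (3.10) and (3.11), we obtain

$$\begin{aligned}
E_1[1^n] &= 1, \\
E_2[1^n] &= 1 + E_1\!\left[1\!\left(\tfrac{n-1}{n}\right), \infty\!\left(\tfrac{1}{n}\right)\right] = 1 + \frac{n}{n-1}, \\
E_3[1^n] &= 1 + \frac{n}{n-1} + E_1\!\left[1\!\left(\tfrac{n-2}{n}\right), \infty\!\left(\tfrac{2}{n}\right)\right] = 1 + \frac{n}{n-1} + \frac{n}{n-2}, \\
&\;\;\vdots \\
E_s[1^n] &= 1 + \frac{n}{n-1} + \cdots + \frac{n}{n-s+1},
\end{aligned} \tag{4.1}$$

$$\begin{aligned}
\sigma_1^2[1^n] &= 0, \\
\sigma_2^2[1^n] &= 0 + \sigma_1^2\!\left[1\!\left(\tfrac{n-1}{n}\right), \infty\!\left(\tfrac{1}{n}\right)\right] = \frac{n}{(n-1)^2}, \\
\sigma_3^2[1^n] &= 0 + \frac{n}{(n-1)^2} + \sigma_1^2\!\left[1\!\left(\tfrac{n-2}{n}\right), \infty\!\left(\tfrac{2}{n}\right)\right] = \frac{n}{(n-1)^2} + \frac{2n}{(n-2)^2}, \\
&\;\;\vdots \\
\sigma_s^2[1^n] &= \frac{n}{(n-1)^2} + \frac{2n}{(n-2)^2} + \cdots + \frac{(s-1)n}{(n-s+1)^2} = n \sum_{t=1}^{s-1} \frac{t}{(n-t)^2}.
\end{aligned} \tag{4.2}$$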
The solution for this problem for s = n is given in Uspensky [9], but a straight- 
forward solution requires a great deal of formal manipulation. The step-by- 
step procedure used here is somewhat indicative of the methods to be used in the 
succeeding portions of this paper. 
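The step-by-step sums for the occupancy problem can be evaluated exactly with rational arithmetic. This is a minimal sketch of the closed forms obtained by adding one term of (3.10) and (3.11) per newly occupied box; the function names are illustrative.

```python
from fractions import Fraction


def occupancy_mean(n, s):
    """Mean number of trials to occupy s of n equally likely boxes."""
    return sum(Fraction(n, n - t) for t in range(s))


def occupancy_variance(n, s):
    """Variance of the number of trials to occupy s of n equally likely boxes."""
    return sum(Fraction(t * n, (n - t) ** 2) for t in range(1, s))


print(occupancy_mean(6, 6))             # 147/10
print(float(occupancy_variance(6, 6)))  # 38.99
```

For $n = s = 6$ this is the familiar problem of the number of throws of a die needed to see every face.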

4.2. Mean and variance of the number of trials required to obtain either $k_1$ balls
in the first box, or $k_2$ in the second, $\cdots$, or $k_{n-1}$ in the $(n - 1)$st, the probability
associated with the $n$th box being non-zero. The mean number of trials in this
particular problem is represented by $E_1[k_1(p_1), \cdots, k_{n-1}(p_{n-1}), \infty(p_n)]$. A
formal expression for this quantity is

$$\sum_{i=1}^{n-1} \sum_{j=k_i}^{\infty} j\,\frac{(j-1)!}{(k_i-1)!}\,p_i^{k_i} \sum \frac{p_1^{r_1} \cdots p_{i-1}^{r_{i-1}}\, p_{i+1}^{r_{i+1}} \cdots p_n^{r_n}}{r_1! \cdots r_{i-1}!\; r_{i+1}! \cdots r_n!}, \tag{4.3}$$

where the third sum is taken over all values of the $r$'s such that

$$r_1 + \cdots + r_{i-1} + r_{i+1} + \cdots + r_n = j - k_i$$

and

$$r_1 < k_1, \cdots, r_{i-1} < k_{i-1},\; r_{i+1} < k_{i+1}, \cdots, r_{n-1} < k_{n-1}.$$

This expression can be reduced by one dimension by the application of some of
the results for two boxes. Consider for the moment only those balls going into
the first $(n - 1)$ boxes. Then the number of balls (conditional) which is necessary
to obtain either $k_1$ in the first box, or $k_2$ in the second, $\cdots$, or $k_{n-1}$ in the
$(n - 1)$st box is a random variable $X$ which takes on values

$$k_1,\; k_1 + 1,\; \cdots,\; k_1 + k_2 + \cdots + k_{n-1} - (n - 2)$$

with corresponding probabilities $\pi_i$, where with no loss of generality it is assumed
that $k_1 \le k_2 \le \cdots \le k_{n-1}$. $\pi_i$ is given by a sum of $(n - 1)$ multinomial
expressions, the probability associated with the $i$th box now being
$p_i / \left(\sum_{t=1}^{n-1} p_t\right)$, which will be designated by $p'_i$.
Under these circumstances it is apparent that

$$E_1[k_1(p_1), \cdots, k_{n-1}(p_{n-1}), \infty(p_n)] = \sum_i \pi_i\, E_1[x_i(p_1 + \cdots + p_{n-1}), \infty(p_n)]. \tag{4.4}$$

However, (3.10) can be applied to each term in (4.4), leading to

$$E_1[k_1(p_1), \cdots, k_{n-1}(p_{n-1}), \infty(p_n)] = \frac{1}{p_1 + \cdots + p_{n-1}} \sum_i \pi_i x_i. \tag{4.5}$$

Now from the definition of $\pi_i$ and $x_i$ we have

$$E_1[k_1(p_1), \cdots, k_{n-1}(p_{n-1}), \infty(p_n)] = \frac{1}{p_1 + p_2 + \cdots + p_{n-1}}\, E_1[k_1(p'_1), k_2(p'_2), \cdots, k_{n-1}(p'_{n-1})]. \tag{4.6}$$
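Similarly, the application of (3.11) gives the result that

$$\sigma_1^2[k_1(p_1), \cdots, k_{n-1}(p_{n-1}), \infty(p_n)] = \frac{1}{(p_1 + \cdots + p_{n-1})^2}\, \sigma_1^2[k_1(p'_1), \cdots, k_{n-1}(p'_{n-1})] + \frac{p_n}{(p_1 + \cdots + p_{n-1})^2}\, E_1[k_1(p'_1), \cdots, k_{n-1}(p'_{n-1})]. \tag{4.7}$$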







These results are of immediate importance for two reasons:

1. They indicate that by combining boxes and introducing a new random 
variable, certain problems can be simplified. This statement will be expanded 
and the principle applied repeatedly in the later portions of this paper. 

2. With respect to the section on two boxes, they mean that the restriction
$p_1 + p_2 = 1$ is not necessary for the solution of the problems. One can always
assume that $p_3\,(= 1 - p_1 - p_2)$ refers to a box which receives balls but which
otherwise has no effect on the outcome of an experiment. In this paper it has
been convenient to refer to such a box as having an infinite capacity.
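The scaling in (4.6) can be confirmed numerically for two boxes accompanied by a box of infinite capacity. The sketch below computes the exact mean by first-step analysis; the function name and the particular values are assumptions made for illustration.

```python
from functools import lru_cache


def e1_exact(ks, ps):
    """Exact mean throws until some box i holds ks[i] balls.

    ps may sum to less than one; the missing probability is assigned to an
    extra box of infinite capacity, which never stops the process.
    """
    q = sum(ps)

    @lru_cache(maxsize=None)
    def mean(state):
        if min(state) == 0:
            return 0.0
        nxt = sum(p * mean(state[:i] + (state[i] - 1,) + state[i + 1:])
                  for i, p in enumerate(ps))
        return (1.0 + nxt) / q     # a ball in the infinite box changes nothing

    return mean(tuple(ks))


lhs = e1_exact([3, 5], [0.2, 0.3])                  # third box: p3 = 0.5, infinite
rhs = e1_exact([3, 5], [0.4, 0.6]) / (0.2 + 0.3)    # right-hand side of (4.6)
print(lhs, rhs)   # the two values agree to rounding error
```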

4.3. The mean value and variance of the number of trials required in a two box
problem when one or both of the constants $k_1$ and $k_2$ are replaced by random variables.
The discussion in 4.2 has indicated that the idea of associating a random variable
with a box instead of a single integer may sometimes lead to simplification.
Here this procedure will be treated in more detail. Consider $E_1[k_1(p_1), k_2(p_2)]$
and assume that $k_1$ is replaced by a random variable $X$ which can take on values
$x_1, x_2, \cdots, x_t$ with corresponding probabilities $\pi_1, \cdots, \pi_i, \cdots, \pi_t$. Under
these circumstances $E_1[\ ]$ itself becomes the random variable $E_1[X(p_1), k_2(p_2)]$,
taking on values $E_1[x_i(p_1), k_2(p_2)]$, $(i = 1, 2, \cdots, t)$, with corresponding
probabilities $\pi_i$. The mean value of this new random variable can be formally
written down as

$$E(E_1[X(p_1), k_2(p_2)]) = \sum_{i=1}^{t} \pi_i\, E_1[x_i(p_1), k_2(p_2)]. \tag{4.8}$$

This expression can always be calculated from the probabilities $\pi_i$ and (3.8)
or from the curves given in section 6. However, in the applications which will
arise later in this paper, this computation would be very time consuming. Instead,
an approximation to (4.8) will now be derived which will prove to yield
very good results, and which can be obtained by a simple reading on the above
mentioned curves.

If $X$ is regarded as a continuous variable, then $E_1[X(p_1), k_2(p_2)]$ is a continuous
function of $X$, and, in fact, can be represented by a single curve similar
to those appearing in section 6. Moreover, as is apparent from (3.8), repeated
differentiation of $E_1[X(p_1), k_2(p_2)]$ yields continuous derivatives. Consequently,
$E_1[X(p_1), k_2(p_2)]$ can be expanded in Taylor series about $a$, where $a = \sum_{i=1}^{t} \pi_i x_i$.
This procedure gives

$$E(E_1[X(p_1), k_2(p_2)]) = \sum_{i=1}^{t} \pi_i \sum_{j=0}^{\infty} \frac{(x_i - a)^j}{j!}\, E_1^{(j)}[a(p_1), k_2(p_2)], \tag{4.9}$$

where $E_1^{(j)}[a(p_1), k_2(p_2)]$ represents the $j$th derivative of $E_1[X(p_1), k_2(p_2)]$ with
respect to $X$ evaluated at $a$. Interchanging the order of summation one obtains

$$\sum_{j=0}^{\infty} \frac{E_1^{(j)}[a(p_1), k_2(p_2)]}{j!} \sum_{i=1}^{t} \pi_i (x_i - a)^j. \tag{4.10}$$

The final result then becomes

$$E(E_1[X(p_1), k_2(p_2)]) = \sum_{j=0}^{\infty} \frac{E_1^{(j)}[a(p_1), k_2(p_2)]}{j!}\, \mu_j, \tag{4.11}$$

where $\mu_j$ is the $j$th moment of $X$ about its mean, $a$. Thus to a first approximation

$$E(E_1[X(p_1), k_2(p_2)]) \approx E_1[a(p_1), k_2(p_2)]. \tag{4.12}$$

It is of interest to note that if $E_1[X(p_1), k_2(p_2)]$ is linear in $X$ then (4.12) is an
exact expression since all derivatives except the first are zero. Furthermore,
if $E_1[X(p_1), k_2(p_2)]$ is of the second degree in $X$, then only the second non-zero
term on the right hand side of (4.11) needs to be added to (4.12) in order to
make it exact. The former of these is the relation which gave an exact solution
in 4.2.
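As a sketch of the next order of approximation, assuming only (4.11) together with $\mu_0 = 1$ and $\mu_1 = 0$ (moments taken about the mean), retaining the $j = 2$ term gives

```latex
E\bigl(E_1[X(p_1), k_2(p_2)]\bigr)
  \approx E_1[a(p_1), k_2(p_2)]
  + \tfrac{1}{2}\,\mu_2\, E_1^{(2)}[a(p_1), k_2(p_2)],
\qquad
\mu_2 = \sum_{i=1}^{t} \pi_i (x_i - a)^2 .
```

To this order, then, the error of (4.12) is governed by the variance of $X$ and by the curvature of $E_1[X(p_1), k_2(p_2)]$ at $X = a$.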

It is important to realize that this analysis for $E(E_1[X(p_1), k_2(p_2)])$ can be
immediately applied to $E(E_2[X(p_1), k_2(p_2)])$. For, by the use of (3.14) and
(4.8), one obtains

$$E(E_2[X(p_1), k_2(p_2)]) = \frac{a}{p_1} + \frac{k_2}{p_2} - E(E_1[X(p_1), k_2(p_2)]). \tag{4.13}$$

The same analysis can be applied to $F_{h,i}[\ ]$ and the general result obtained
that

$$E(F_{h,i}[X(p_1), k_2(p_2)]) \approx F_{h,i}[a(p_1), k_2(p_2)]. \tag{4.14}$$


This immediately allows one to approximate the variance in the obvious manner.
It is of interest to consider briefly the situation when both $k_1$ and $k_2$ are replaced
by random variables. Let $k_1$ be replaced by $X_1$ taking on values $x_{11}, x_{12}, \cdots, x_{1t}$
with probabilities $\pi_{11}, \pi_{12}, \cdots, \pi_{1t}$ and $k_2$ be replaced by $X_2$
taking on values $x_{21}, x_{22}, \cdots, x_{2s}$ with probabilities $\pi_{21}, \pi_{22}, \cdots, \pi_{2s}$. Then

$$E(E_1[X_1(p_1), X_2(p_2)]) = \sum_i \sum_j \pi_{1i}\, \pi_{2j}\, E_1[x_{1i}(p_1), x_{2j}(p_2)], \tag{4.15}$$

where $i = 1, 2, \cdots, t$ and $j = 1, 2, \cdots, s$. Again applying Taylor series
and expanding about $a = \sum_i \pi_{1i} x_{1i}$ and $b = \sum_j \pi_{2j} x_{2j}$, the result is obtained
that

$$E(E_1[X_1(p_1), X_2(p_2)]) = \sum_{u,v=0}^{\infty} \frac{E_1^{(u,v)}[a(p_1), b(p_2)]}{u!\, v!}\, \mu_u\, \nu_v, \tag{4.16}$$

where $E_1^{(u,v)}[a(p_1), b(p_2)]$ is the $u$th partial derivative with respect to $X_1$ and
the $v$th partial derivative with respect to $X_2$ of $E_1[X_1(p_1), X_2(p_2)]$ evaluated
at $X_1 = a$, $X_2 = b$, and $\mu_u$ and $\nu_v$ are the moments of $X_1$ and $X_2$ about their
means. This gives the approximate formula

$$E(E_1[X_1(p_1), X_2(p_2)]) \approx E_1[a(p_1), b(p_2)]. \tag{4.17}$$




4.4. Mean and variance of the number of trials required to obtain either (at
least) $k_1$ balls in the first box, or (at least) $k_2$ balls in the second box, $\cdots$, or (and
at least) $k_n$ balls in the $n$th box. In accordance with previous notation the mean
number of trials required is given by $E_1[k_1(p_1), k_2(p_2), \cdots, k_n(p_n)]$. The exact
value of this quantity can be written down and it would be a complicated multinomial
expression. The evaluation of such an expression would be extremely
difficult, if not impossible, especially for large values of $k_1, k_2, \cdots, k_n$. In
order to obtain an approximation to $E_1[\ ]$, repeated applications of (4.12) can
be made and the resulting expression can be evaluated by means of the curves
in section 6.

For convenience, consider $E_1[k_1(p_1), k_2(p_2), k_3(p_3), k_4(p_4)]$. The general
result will then be apparent. Assume that the first three boxes form a single
unit with probability $(p_1 + p_2 + p_3)$. Then the number of balls required to
obtain either $k_1$ in the first, $k_2$ in the second or $k_3$ in the third, if all balls are going
in these three boxes, is a random variable $X$. Consequently,

$$E_1[k_1(p_1), \cdots, k_4(p_4)] = E(E_1[X(p_1 + p_2 + p_3), k_4(p_4)]). \tag{4.18}$$

Applying (4.12),

$$E_1[k_1(p_1), \cdots, k_4(p_4)] \approx E_1\!\left[ E_1\!\left[ k_1\!\left(\tfrac{p_1}{p_1+p_2+p_3}\right), k_2\!\left(\tfrac{p_2}{p_1+p_2+p_3}\right), k_3\!\left(\tfrac{p_3}{p_1+p_2+p_3}\right) \right]\!(p_1 + p_2 + p_3),\; k_4(p_4) \right]. \tag{4.19}$$

Applying (4.12) once again the final approximation is

$$E_1[k_1(p_1), \cdots, k_4(p_4)] \approx E_1\!\left[ E_1\!\left[ E_1\!\left[ k_1\!\left(\tfrac{p_1}{p_1+p_2}\right), k_2\!\left(\tfrac{p_2}{p_1+p_2}\right) \right]\!\left(\tfrac{p_1+p_2}{p_1+p_2+p_3}\right),\; k_3\!\left(\tfrac{p_3}{p_1+p_2+p_3}\right) \right]\!(p_1 + p_2 + p_3),\; k_4(p_4) \right]. \tag{4.20}$$


Expression (4.20) can be translated into a course of procedure. One considers
the first two boxes and computes

$$a_1 = E_1\!\left[ k_1\!\left(\tfrac{p_1}{p_1+p_2}\right), k_2\!\left(\tfrac{p_2}{p_1+p_2}\right) \right].$$

It is then assumed that $a_1$ is a new number associated with a box with probability
$(p_1 + p_2)$ and

$$a_2 = E_1\!\left[ a_1\!\left(\tfrac{p_1+p_2}{p_1+p_2+p_3}\right), k_3\!\left(\tfrac{p_3}{p_1+p_2+p_3}\right) \right].$$

Repeating this procedure again, one computes $a_3 = E_1[a_2(p_1 + p_2 + p_3), k_4(p_4)]$,
and by (4.20) this is approximately equal to $E_1[k_1(p_1), \cdots, k_4(p_4)]$. This method
of computation is seen to be completely general and one can apply it to any number
of boxes. Each step consists of computing $E_1[\ ]$ for two boxes and consequently
can be carried out with the curves of section 6. It is evident that
the order in which the boxes are taken may have an important effect on the size
of the error involved in using this step-by-step procedure. This problem will
be considered in section 5.
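The step-by-step procedure can be sketched in code, replacing the curves of section 6 by numerical evaluation. The sketch rests on two assumptions that are not taken from the paper: the two box mean is extended to non-integer arguments through the incomplete beta form $(k_1/p_1) I_{p_1}(k_1 + 1, k_2) + (k_2/p_2) I_{p_2}(k_2 + 1, k_1)$, which is the form implied by (3.13) and (3.14), and the incomplete beta function is evaluated by Simpson's rule. All function names are illustrative.

```python
from functools import lru_cache
from math import exp, lgamma, log


def reg_inc_beta(x, a, b, n=2000):
    """I_x(a, b) by Simpson's rule; assumes a > 1, b > 0 and 0 < x < 1."""
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)

    def density(t):
        if t <= 0.0:
            return 0.0   # t**(a - 1) vanishes at 0 because a > 1
        return exp(log_norm + (a - 1) * log(t) + (b - 1) * log(1.0 - t))

    h = x / n
    total = density(0.0) + density(x)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3.0


def e1_two(k1, p1, k2, p2):
    """Two box mean with p1 + p2 = 1; k1 and k2 need not be integers."""
    return ((k1 / p1) * reg_inc_beta(p1, k1 + 1, k2)
            + (k2 / p2) * reg_inc_beta(p2, k2 + 1, k1))


def e1_stepwise(ks, ps):
    """Combine the boxes two at a time, in the order given."""
    a, mass = float(ks[0]), ps[0]
    for k, p in zip(ks[1:], ps[1:]):
        total = mass + p
        a = e1_two(a, mass / total, k, p / total)
        mass = total
    return a / mass   # mass equals 1 when the probabilities sum to one


def e1_exact(ks, ps):
    """Exact mean by first-step analysis (integer ks, probabilities summing to one)."""
    @lru_cache(maxsize=None)
    def mean(state):
        if min(state) == 0:
            return 0.0
        return 1.0 + sum(p * mean(state[:i] + (state[i] - 1,) + state[i + 1:])
                         for i, p in enumerate(ps))
    return mean(tuple(ks))


exact = e1_exact([2, 2, 1], [0.25, 0.25, 0.5])
approx = e1_stepwise([2, 2, 1], [0.25, 0.25, 0.5])
print(exact, approx)   # exact 1.625; approximation about 1.646, an overestimate of roughly 1.3%
```

Passing the boxes to `e1_stepwise` in a different order gives a different approximation, which is the effect of ordering mentioned at the end of the preceding paragraph.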

It is of interest to note that one can also obtain another approximation for
$E_1[k_1(p_1), k_2(p_2), k_3(p_3), k_4(p_4)]$. Suppose that the first two boxes are considered
as one unit and the second two boxes as another unit. Then the number
of balls which must fall in the first two boxes in order to obtain either $k_1$ in
the first box or $k_2$ in the second is a random variable $X_1$. Similarly a random
variable $X_2$ can be associated with the last two boxes. Accordingly

$$E_1[k_1(p_1), \cdots, k_4(p_4)] = E(E_1[X_1(p_1 + p_2), X_2(p_3 + p_4)]). \tag{4.21}$$

By use of (4.17), (4.21) can be written as

$$E_1[k_1(p_1), \cdots, k_4(p_4)] \approx E_1\!\left[ E_1\!\left[ k_1\!\left(\tfrac{p_1}{p_1+p_2}\right), k_2\!\left(\tfrac{p_2}{p_1+p_2}\right) \right]\!(p_1 + p_2),\; E_1\!\left[ k_3\!\left(\tfrac{p_3}{p_3+p_4}\right), k_4\!\left(\tfrac{p_4}{p_3+p_4}\right) \right]\!(p_3 + p_4) \right]. \tag{4.22}$$

This same analysis applies directly to the factorial moments. In particular

$$F_{2,1}[k_1(p_1), \cdots, k_4(p_4)] \approx F_{2,1}\!\left[ E_1\!\left[ E_1\!\left[ k_1\!\left(\tfrac{p_1}{p_1+p_2}\right), k_2\!\left(\tfrac{p_2}{p_1+p_2}\right) \right]\!\left(\tfrac{p_1+p_2}{p_1+p_2+p_3}\right),\; k_3\!\left(\tfrac{p_3}{p_1+p_2+p_3}\right) \right]\!(p_1 + p_2 + p_3),\; k_4(p_4) \right]. \tag{4.23}$$

From (4.20) and (4.23) an approximate value for $\sigma_1^2[k_1(p_1), k_2(p_2), k_3(p_3), k_4(p_4)]$
can be obtained. This procedure is also perfectly general and so an estimate
of $\sigma_1^2[\ ]$ can be obtained for any number of boxes.

This same method can be immediately applied to the approximation of
$E_n[k_1(p_1), \cdots, k_n(p_n)]$. One simply considers the boxes two at a time, computing
$E_2[\ ]$ at each stage instead of $E_1[\ ]$.

4.5. Solution for $E_s[k^n]$ and $E_s[k_1^{n-1}, k_2]$. When $s$ is different from 1 or $n$,
the complexities of the problem force one into the consideration of only the
quantities given in the title of this subsection. The corresponding problem
for three boxes, namely $E_2[k_1(p_1), k_2(p_2), k_3(p_3)]$, has been treated for general
$k_i$ and $p_i$ by McCarthy [5]. However, the resulting expression is so complicated
that it will not be given here.

The process to be used consists of reducing the subscript $s$ by a series of steps
until the subscript 2 is reached. This expression can then be evaluated by the
use of the curves or by simple computation. For the sake of convenience, the
case $E_3[k^4]$ will be considered in detail. It will then be possible to write down the
expression for general $s$ and $n$.

As a starting point, look upon the first three boxes as a single unit. Then
there is a definite probability $\pi_i$ that one of these boxes will have $k$ balls in it for
the first time on the $x_i$th throw into these three boxes and that the other two
boxes of the unit will each have less than $k$ balls. Then if one of the other of
the three boxes has $u$ balls $(u < k)$ the third box will have $(x_i - k - u)$ balls,
$(x_i - k - u < k)$. Meanwhile the fourth box will also have been receiving balls,
and the number in it at this time will be denoted by $j$, $(j = 0, 1, 2, \cdots, \infty)$.
For each $x_i$ there is a probability associated with $u$, namely $P(u \mid x_i)$, and another
probability associated with $j$, $P(j \mid x_i)$. For the moment, consider that box
1 has received $k$ balls, box 2 the $(x_i - k - u)$ balls, box 3 the $u$ balls and box 4
the $j$ balls. This numbering is of course immaterial since the situation is symmetric
with respect to the first three boxes.

Now if $j \ge k$, either $(2k + u - x_i)$ balls will be required in the second box or
$(k - u)$ balls in the third box in order to obtain three properly occupied boxes.
On the other hand, if $j < k$, the specified number will be required in any two of
boxes two, three and four. Consequently, with this conditional description of
the situation, the required number of balls necessary to obtain three out of the
four boxes occupied in the proper manner is

$$x_i + j + E_2[(2k + u - x_i), (k - u), (k - j)], \tag{4.24}$$

where $(k - j)$ will be taken as zero if $j$ is greater than or equal to $k$. From this
description, it is evident that the desired mean value may be obtained by summing
(4.24) over all possible values of $x_i$, $j$ and $u$. Therefore

$$E_3[k^4] = \sum_i \pi_i \Bigl\{ x_i + \sum_j P(j \mid x_i) \Bigl( j + \sum_u P(u \mid x_i)\, E_2[(2k + u - x_i), (k - u), (k - j)] \Bigr) \Bigr\}. \tag{4.25}$$

It is to be noticed that the probabilities inside the $E_2[\ ]$ in (4.24) and (4.25)
do not add to one but only to 3/4. This can be easily remedied by the application
of a formula similar to (4.6) and the result is obtained that

$$E_3[k^4] = \sum_i \pi_i \Bigl\{ x_i + \sum_j P(j \mid x_i) \Bigl( j + \tfrac{4}{3} \sum_u P(u \mid x_i)\, E_2[(2k + u - x_i), (k - u), (k - j)] \Bigr) \Bigr\}, \tag{4.26}$$

where each probability inside $E_2[\ ]$ is now 1/3.







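By simple considerations

$$P(u \mid x_i) = \frac{\dfrac{(x_i - k)!}{u!\,(x_i - k - u)!} \left(\tfrac{1}{2}\right)^{u} \left(\tfrac{1}{2}\right)^{x_i - k - u}}{\displaystyle\sum_{u} \dfrac{(x_i - k)!}{u!\,(x_i - k - u)!} \left(\tfrac{1}{2}\right)^{u} \left(\tfrac{1}{2}\right)^{x_i - k - u}}, \tag{4.27}$$

where $u$ and $(x_i - k - u)$ are both less than $k$, and

$$P(j \mid x_i) = \frac{(x_i + j - 1)!}{(x_i - 1)!\, j!} \left(\tfrac{3}{4}\right)^{x_i} \left(\tfrac{1}{4}\right)^{j}. \tag{4.28}$$

From (4.27) and (4.28)

$$\sum_j j\,P(j \mid x_i) = x_i/3, \tag{4.29}$$

and

$$\sum_u u\,P(u \mid x_i) = \frac{x_i - k}{2}. \tag{4.30}$$

(4.25) can be written as

$$E_3[k^4] = \sum_i \pi_i x_i + \sum_i \pi_i \sum_j j\,P(j \mid x_i) + \tfrac{4}{3} \sum_i \pi_i \sum_j P(j \mid x_i) \sum_u P(u \mid x_i)\, E_2[(2k + u - x_i), (k - u), (k - j)]. \tag{4.31}$$

Finally, making use of (4.29), (4.30), the definition of $x_i$ and $\pi_i$ and the procedure
of replacing random variables inside an $E_2[\ ]$ by their mean values,

$$E_3[k^4] \approx \tfrac{4}{3} E_1[k^3] + \tfrac{4}{3} E_2\!\left[ \left(2k + \tfrac{E_1[k^3] - k}{2} - E_1[k^3]\right), \left(k - \tfrac{E_1[k^3] - k}{2}\right), \left(k - \tfrac{E_1[k^3]}{3}\right) \right], \tag{4.32}$$

and this in turn can be written as

$$E_3[k^4] \approx \tfrac{4}{3} \left\{ E_1[k^3] + E_2\!\left[ \left(\tfrac{3}{2}k - \tfrac{E_1[k^3]}{2}\right)^{2}, \left(k - \tfrac{E_1[k^3]}{3}\right) \right] \right\}. \tag{4.33}$$

This method of analysis which has just been applied to $E_3[k^4]$ can be used
equally well for $E_s[k^n]$. Here one simply considers the first $(n - 1)$ boxes and
proceeds as above. The final result is immediately apparent, namely that

$$E_s[k^n] \approx \frac{n}{n-1} \left\{ E_1[k^{n-1}] + E_{s-1}\!\left[ \left(\tfrac{n-1}{n-2}\,k - \tfrac{E_1[k^{n-1}]}{n-2}\right)^{n-2}, \left(k - \tfrac{E_1[k^{n-1}]}{n-1}\right) \right] \right\}. \tag{4.34}$$

It will be noticed that in reducing (4.34) further it will be necessary to consider
expressions of the form $E_s[k_1^{n-1}, k_2]$. However, it will be seen from the foregoing
analysis that no use was made of the fact that the integers attached to the first
$(n - 1)$ boxes were the same. Accordingly,

$$E_s[k_1^{n-1}, k_2] \approx \frac{n}{n-1} \left\{ E_1[k_1^{n-1}] + E_{s-1}\!\left[ \left(\tfrac{n-1}{n-2}\,k_1 - \tfrac{E_1[k_1^{n-1}]}{n-2}\right)^{n-2}, \left(k_2 - \tfrac{E_1[k_1^{n-1}]}{n-1}\right) \right] \right\}. \tag{4.35}$$

Now, by the use of (4.34) and (4.35), it is possible to reduce $s$ as much as may be
desired.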


5. Some considerations concerning the error of the approximations. 

5.1. Preliminary remarks. This discussion of the errors of the approximations 
given in the preceding sections has been left until now so that a broad perspec- 
tive might be gained, and the errors seen in relationship to one another. Such 
an arrangement is advantageous in this instance since both the analytical and 
computational results bearing on the subject are scanty, and consequently, 
any intelligent leads which their inter-relationships can give are most helpful. 

The difficulty involved in obtaining exact values for the various quantities 
considered in this paper has been pointed out quite frequently, and the approxi- 
mations have been devised to overcome this very difficulty. The same com- 
plexity which prevents the computation of many exact values also prevents any 
effective analytic approach to the problem of evaluating the errors. For these 
reasons the author has been unable to carry through any general analytic treat- 
ment of the errors of the approximations. However, because the intelligent use 
of approximations requires some knowledge of their accuracy, certain isolated 
cases have been investigated by a combination of computational, graphical and 
analytic methods. These investigations are detailed in the remainder of this 
section, and conjectures concerning the general behavior of the errors are made 
whenever possible. As has been stated earlier, no consideration will be given 
to the approximation formulae for the variance. 

5.2. Errors of the approximations for $E_1[k_1(p_1), \cdots, k_n(p_n)]$ and
$E_n[k_1(p_1), \cdots, k_n(p_n)]$.
Taking $n$ equal to 3, we have from (4.11) that

$$\bigl|\, E_1[k_1(p_1), k_2(p_2), k_3(p_3)] - E_1[a(p_1 + p_2), k_3(p_3)] \,\bigr| \le \tfrac{1}{2}\, \sigma_1^2\!\left[ k_1\!\left(\tfrac{p_1}{p_1+p_2}\right), k_2\!\left(\tfrac{p_2}{p_1+p_2}\right) \right] \cdot \operatorname{Max} \bigl|\, E_1''[X(p_1 + p_2), k_3(p_3)] \,\bigr|, \tag{5.1}$$

where $\operatorname{Max} |E_1''[X(p_1 + p_2), k_3(p_3)]|$ is the maximum absolute value of the
second derivative of $E_1[X(p_1 + p_2), k_3(p_3)]$ with respect to $X$, and $a$ is equal to
$E_1[k_1(p_1/(p_1 + p_2)), k_2(p_2/(p_1 + p_2))]$. Now an examination of the curves







given in section 6 indicates that, for fixed $p_3$ and $k_2$, the maximum curvature of
$E_1[X(p_1 + p_2), k_3(p_3)]$, considered as a function of $X$, is a monotone decreasing
function of $k_3$. Since this curvature is negative, this geometric observation is
equivalent to

$$\operatorname{Max} \bigl|\, E_1''[X(p_1 + p_2), (k_3 + 1)(p_3)] \,\bigr| < \operatorname{Max} \bigl|\, E_1''[X(p_1 + p_2), k_3(p_3)] \,\bigr|, \tag{5.2}$$

although it is not necessarily true that

$$\bigl|\, E_1''[x_i(p_1 + p_2), (k_3 + 1)(p_3)] \,\bigr| < \bigl|\, E_1''[x_i(p_1 + p_2), k_3(p_3)] \,\bigr|.$$

Moreover,

$$E_1[k_1(p_1), k_2(p_2), k_3(p_3)] < E_1[k_1(p_1), k_2(p_2), (k_3 + 1)(p_3)]. \tag{5.3}$$

From (5.1), (5.2) and (5.3) one readily obtains that the absolute value of the
percentage error of the approximation to $E_1[k_1(p_1), k_2(p_2), k_3(p_3)]$ is bounded by
a function, say $U_1[k_1(p_1), k_2(p_2), k_3(p_3)]$, which is a monotone decreasing function
of $k_3$ as $k_3$ increases. It should be noticed that the results of 4.2 have already
shown not only that this upper bound for the percentage error approaches zero
as $k_3$ becomes infinite, but also that the absolute difference between the true and
approximate values approaches zero as $k_3$ becomes infinite.

Computation of $U_1[k_1(p_1), k_2(p_2), k_3(p_3)]$ is very time consuming because of the
difficulty in obtaining $\operatorname{Max} |E_1''[X(p_1 + p_2), k_3(p_3)]|$, and because the direct
computation of $E_1[k_1(p_1), k_2(p_2), k_3(p_3)]$ is laborious when any of $k_1$, $k_2$ and $k_3$
are much larger than 2 or 3. In order to surmount these difficulties and still
give some indication of the behavior of $U_1[k_1(p_1), k_2(p_2), k_3(p_3)]$, the following
expedients were adopted:

1. The values of $k_1$, $k_2$ and $k_3$ were each fixed at 5.

2. $\operatorname{Max} |E_1''[X(p_1 + p_2), k_3(p_3)]|$ was obtained by graphical means, namely
drawing the slopes of the appropriate curve in section 6, graphing these slopes
and then taking off the maximum slopes of these curves.

3. $E_1[k_1(p_1), k_2(p_2), k_3(p_3)]$ was replaced by its approximation,
$E_1[a(p_1 + p_2), k_3(p_3)]$, in the computation of the percentage error. This new
bound will be denoted by $U_1^*[k_1(p_1), k_2(p_2), k_3(p_3)]$.

4. Carefully chosen values of $U_1^*[k_1(p_1), k_2(p_2), k_3(p_3)]$ were plotted on triangular
coordinates, and contour lines interpolated and extrapolated to cover in
large part the range of $p_1$, $p_2$ and $p_3$.

The use of the third of the above listed assumptions is no detriment to the
usefulness of the results since

$$\frac{E_{1a}[\ ] - E_1[\ ]}{E_1[\ ]} = \frac{\dfrac{E_{1a}[\ ] - E_1[\ ]}{E_{1a}[\ ]}}{1 - \dfrac{E_{1a}[\ ] - E_1[\ ]}{E_{1a}[\ ]}} \le \frac{U_1^*[k_1(p_1), k_2(p_2), k_3(p_3)]}{100 - U_1^*[k_1(p_1), k_2(p_2), k_3(p_3)]},$$
- 

































where $E_{1a}[\ ] = E_1[a(p_1 + p_2), k_3(p_3)]$ and $E_1[\ ] = E_1[k_1(p_1), k_2(p_2), k_3(p_3)]$.
Since $U_1^*[\ ]$ is a monotone decreasing function of $k_3$, this new bound on the
percentage error is also monotone decreasing for increasing $k_3$. Absolute values
were not required in this derivation since $E_{1a}[\ ]$ is always greater than or equal
to $E_1[\ ]$, as is apparent from (5.1) and an examination of the curves of section
6. The contours of $U_1^*[5(p_1), 5(p_2), 5(p_3)]$ are shown in Fig. 1. The interpretation
of this figure is very straightforward. For example, for $p_3 < .5$, the value
of $U_1^*[5(p_1), 5(p_2), 5(p_3)]$ is less than 5.0%. Making use of the definition of
$U_1[\ ]$, and especially its monotone characteristic, one can then say: the approximation
for $E_1[5(p_1), 5(p_2), k_3(p_3)]$, where $k_3 \ge 5$, $p_3 < .50$, is in error by not
more than 5.3%. Moreover, as has been already observed, $E_1[a(p_1 + p_2), k_3(p_3)]$
is always greater than or equal to $E_1[k_1(p_1), k_2(p_2), k_3(p_3)]$.

[Fig. 1. Contours of $U_1^*[5(p_1), 5(p_2), 5(p_3)]$ considered as a function of $p_1$, $p_2$ and $p_3$.]

It will be noticed from Fig. 1 that $U_1^*[\ ]$ is increasing steadily as $p_3$ approaches
1. It has been demonstrated by McCarthy [5] that this behavior of the upper
bound does not mean that the percentage error itself becomes larger as $p_3$ approaches
1. As a matter of fact, for fixed $k_1$, $k_2$ and $k_3$, the percentage error
approaches zero as $p_3$ approaches 1. However, this demonstration does not
furnish any reasonable bounds with which to fill in the lower left hand corner of
Fig. 1. This fact is not as serious as it may at first seem because there is nothing
to prevent one from reordering the boxes. For example, consider
$E_1[5(.2), 5(.2), 5(.6)]$. From Fig. 1, the error of the approximation for this quantity,
namely $E_1[E_1[5(.5), 5(.5)](.4), 5(.6)]$, is not more than approximately

7.5/(100 - 7.5) = 8.1%.

On the other hand this same figure shows that $E_1[E_1[5(.25), 5(.75)](.80), 5(.20)]$,
which is also an approximation to $E_1[5(.2), 5(.2), 5(.6)]$, is in error by not more
than approximately .8%. Consequently one would choose the second ordering.
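The arithmetic that converts a reading of $U_1^*[\ ]$ into a bound on the error relative to the true value is simple enough to write down. The function below is a minimal sketch; its name is an assumption made for illustration.

```python
def bound_on_true_error(u_star_percent):
    """Turn a percentage bound relative to the approximation into one relative to the true value."""
    return 100.0 * u_star_percent / (100.0 - u_star_percent)


print(round(bound_on_true_error(7.5), 1))   # 8.1
print(round(bound_on_true_error(5.0), 1))   # 5.3
```

The two printed values correspond to the 8.1% and 5.3% quoted in the text.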

The procedure which has been used to obtain an upper bound on the percentage
error of the approximation to $E_1[k_1(p_1), k_2(p_2), k_3(p_3)]$, $k_1$ and $k_2$ fixed and
$k_3$ greater than or equal to that integer at which the bound is evaluated, can also
be applied to $E_3[k_1(p_1), k_2(p_2), k_3(p_3)]$. All the assumptions remain the same
and in this case the bounds corresponding to $U_1[\ ]$ and $U_1^*[\ ]$ are denoted by
$U_3[\ ]$ and $U_3^*[\ ]$. As in the case of $U_1[\ ]$, we have

$$\frac{E_3[\ ] - E_{2b}[\ ]}{E_3[\ ]} = \frac{\dfrac{E_3[\ ] - E_{2b}[\ ]}{E_{2b}[\ ]}}{1 + \dfrac{E_3[\ ] - E_{2b}[\ ]}{E_{2b}[\ ]}} \le \frac{U_3^*[k_1(p_1), k_2(p_2), k_3(p_3)]}{100}.$$

Here the approximation, $E_2[b(p_1 + p_2), k_3(p_3)]$, is always less than or equal to
the exact value, $E_3[k_1(p_1), k_2(p_2), k_3(p_3)]$. The contours of $U_3^*[5(p_1), 5(p_2), 5(p_3)]$
are shown in Fig. 2. In using $U_3^*[5(p_1), 5(p_2), 5(p_3)]$ it is sometimes advantageous
to reorder the boxes. For example, consider $E_3[5(.2), 5(.2), 5(.6)]$.
Fig. 2 shows that, as an approximation, $E_2[E_2[5(.5), 5(.5)](.4), 5(.6)]$ is in error
by not more than approximately 9%. However, $E_2[E_2[5(.25), 5(.75)](.80), 5(.20)]$,
which is also an approximation for $E_3[5(.2), 5(.2), 5(.6)]$, is in error by
not more than about 7%. There is a gain here, but it is not as great as in the
corresponding situation for $E_1[5(.2), 5(.2), 5(.6)]$.

As has already been stated, one may minimize the error by correctly choosing
the two boxes which are to be combined first. Some discussion will be given
here of a procedure for choosing these two boxes. Of course an experimental
scheme may be used which makes use of the fact that the approximation to
$E_1[k_1(p_1), k_2(p_2), k_3(p_3)]$ is always an overestimate. In other words, that grouping
is used which gives rise to the smallest value of the approximation. However,
this can be replaced by a few preliminary computations.

As can be seen from (5.1), the error of the approximation depends upon two
quantities, namely the variance of the two box situation obtained by combining
two of the boxes, and the maximum value of the second derivative of the curve
representing the function $E_1[X(p_1 + p_2), k_3(p_3)]$ over the proper range of $X$
values. The error will be zero if $E_1[X(p_1 + p_2), k_3(p_3)]$ is either a constant or
linear in $X$ over the range of $X$ values in which one is interested, that is
$k_1 \le X \le k_1 + k_2 - 1$, $k_1 \le k_2$. If this is not possible, then one wishes to make it
as near so as possible, subject to the restriction that

$$\sigma_1^2[k_1(p_1/(p_1 + p_2)), k_2(p_2/(p_1 + p_2))]$$

is not unnecessarily large.


[Fig. 2. Contours of $U_3^*[5(p_1), 5(p_2), 5(p_3)]$ considered as a function of $p_1$, $p_2$ and $p_3$.]


An indication of the relationship between the boxes for both linearity and contribution
to variance can be obtained from expressions (3.10) and (3.11). Thus
for each box one computes $k_i/p_i$ and $k_i(1 - p_i)/p_i^2$. Then in order to most nearly
achieve linearity one orders the boxes in accordance with the increasing order of
$k_i/p_i$ and combines them in that order. If there is a tie between two or more
boxes with respect to the $k_i/p_i$ ordering, then one orders these "tied" boxes in
accordance with increasing $k_i(1 - p_i)/p_i^2$.

Some computations have been carried out to illustrate these points and they
are given in Table 1. The notation ((2, 4), 6) means that one first combines the
boxes with integers 2 and 4, and then combines this result with the box with
associated integer 6. All values in this table were obtained by direct computation.
No use of the curves was made.

TABLE 1
Effect of Order of Combination on Error of Approximation
(% Error of Approximation)

                                              Order of Combination
$E_1[k_1(p_1), k_2(p_2), k_3(p_3)]$     ((2, 4), 6)   ((2, 6), 4)   ((4, 6), 2)
$E_1[2(1/6), 4(1/3), 6(1/2)]$              +1.7          +2.4          +1.0
$E_1[4(1/6), 6(1/3), 2(1/2)]$              +0.3          +0.5          +0.5
$E_1[6(1/6), 4(1/3), 2(1/2)]$              +0.0          +1.3          +1.6

In these three situations, one obtains the values given in Table 2.

TABLE 2

                         First case          Second case          Third case
$p_i$                  1/6   1/3   1/2     1/6   1/3   1/2     1/6   1/3   1/2
$k_i$                   2     4     6       4     6     2       6     4     2
$k_i/p_i$              12    12    12      24    18     4      36    12     4
$k_i(1 - p_i)/p_i^2$   60    24    12     120    36     4     180    24     4

Thus in the first case there is nothing to choose with respect to $k_i/p_i$, but
$k_i(1 - p_i)/p_i^2$ indicates the ordering ((6, 4), 2). Actually the percentage error
in this instance is 1.0 as compared with 1.7 and 2.4 for the other two orderings.
In case two, $k_i/p_i$ indicates the ordering ((2, 6), 4). Although this does not
turn out to be the best ordering, Table 1 shows that the ordering in this instance
makes little difference. In the last case, the indicated ordering is ((2, 4), 6)
and the percentage error for this is zero, as opposed to 1.3 and 1.6. Since at
any stage in the operation of combining boxes two at a time (4.13) holds, the
above procedure will also give the minimum error for the approximation to
$E_3[k_1(p_1), k_2(p_2), k_3(p_3)]$. Moreover, the approximation for this quantity is always
an underestimate of the true value, and therefore that ordering should be taken which
gives the greatest value for the approximation.
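The ordering rule can be written as a sort on the two quantities $k_i/p_i$ and $k_i(1 - p_i)/p_i^2$. The sketch below applies it to the three cases of Table 2, taking each case to pair the integers 2, 4 and 6 with the probabilities 1/6, 1/3 and 1/2 as the tabulated values of $k_i/p_i$ imply; the function name is an assumption made for illustration.

```python
from fractions import Fraction


def combination_order(ks, ps):
    """Order boxes by increasing k/p, ties broken by increasing k(1 - p)/p**2."""
    def key(box):
        k, p = box
        return (k / p, k * (1 - p) / p ** 2)
    return [k for k, _ in sorted(zip(ks, ps), key=key)]


ps = [Fraction(1, 6), Fraction(1, 3), Fraction(1, 2)]
print(combination_order([2, 4, 6], ps))   # [6, 4, 2], i.e. ((6, 4), 2)
print(combination_order([4, 6, 2], ps))   # [2, 6, 4], i.e. ((2, 6), 4)
print(combination_order([6, 4, 2], ps))   # [2, 4, 6], i.e. ((2, 4), 6)
```

The first two boxes in each returned list are combined first, and the remaining boxes are then taken in the order given.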

When the error of the approximation to $E_1[k_1(p_1), \cdots, k_n(p_n)]$ and
$E_n[k_1(p_1), \cdots, k_n(p_n)]$, for $n$ greater than three, is considered, it is
immediately obvious that the general
considerations already given in this section still apply. In addition to these
considerations, there is the difficulty that errors may cumulate. However, the
results already quoted for three boxes, in conjunction with those which are to
be given in 5.3, indicate that this cumulation is not serious. There are two
factors which eventually prevent (i.e. as more and more boxes are considered)
this percentage error from becoming unduly large, and, in fact, make it approach
zero. These are:

1. The value of $p_i$ will, in most instances, be decreasing as more and more
boxes are considered (see Fig. 1), and













2. The true value is usually becoming larger and larger as more and more boxes 
are considered. 

In order to minimize the error, the following precautions should be taken: 

1. At each stage in the computation, try to avoid, as much as possible, making
readings where $E_1[x_j(p_1 + p_2), k_3(p_3)]$ is curving sharply. If all readings are
made where the curves are nearly linear, the percentage error will be very close
to zero. On the other hand, if many readings must be made where the slopes
of the curves are changing most sharply, larger errors must be expected.

2. Use that ordering of the boxes which provides the minimum value for the
approximation to $E_1[\ ]$ or the maximum value for the approximation to $E_n[\ ]$.

3. In order to approximate the ordering which (2) would give, compute
$k_i/p_i$ and $k_i(1 - p_i)/p_i^2$ at each stage at which two boxes are to be combined
and use the rules of procedure already given for three boxes.

5.3. Error of the approximation for $E_s[k^n]$. Repeated applications of the
reduction formulae (4.34) and (4.35) allow one to evaluate $E_s[k^n]$ by means of the
solution for the two box case, or more explicitly, by means of the curves given in
section 6. Here the error of this approximation will be discussed primarily from
a computational point of view.

$E_s[1^n]$ can be treated in detail since it is possible to obtain exact values for this
expression by means of (4.1). This has been done by McCarthy [5], but the
details will not be repeated here because of lack of space. The results simply
add more credence to the conjectures which will soon be made.

When k is taken to be larger than one, the difficulty arises that it is almost
impossible to compute the exact value of $E_s[k^n]$ in a large number of cases.
Consequently it was necessary to devise an experimental model to estimate these
exact values so that the amount of error would be known within bounds. A
set of 10,000 punched cards¹ was obtained on which were recorded 100,000
random numbers drawn from a rectangular distribution. Thus if the cards are
ordered on a particular set of columns, and one reads off the digits 0-9 on another
specified column, one card at a time, it is equivalent to using a table of random
numbers such as those prepared by Tippett [7]. By the use of these cards, it
was possible to run off on an IBM Tabulator any desired number of experiments
in order to obtain an experimental distribution from which to calculate an
estimate of $E_s[k^n]$ and the variance of this estimate. For example, in determining
an estimate of $E_1[2^5]$ one hundred experimental trials were made, as described
above, with the following results:


Number of Trials Required    Frequency
2                            23
3                            32
4                            31
5                            11
6                            3


1 These punched cards were prepared at the Mayo Clinic, Rochester, Minn., under the 
direction of Doctor Joseph Berkson. 



BOX PROBLEMS 371 


From this distribution the estimate of $E_1[2^5]$ is 3.39, with a variance computed
from the distribution of .011. The 95% symmetric confidence limits for the
mean, computed from the Student $t$-distribution, are 3.17 and 3.61. Such
estimates will be used in the remainder of this section. It should be pointed
out that in order to prevent a prohibitive amount of machine time, it was


TABLE 3
Percentage Errors for $E_s[k^n]$











s k n = 3 4 5
1 1 _ _ _ 
2 + .7 + 2.2 — 3 +13.6 
5 + 1.1 — 3.1 +5.7 + .6 +10.7 
10 —2.9 + 5.1 
2 1 — 5.6 — .4 + 1.3 
2 — 4.6 — 4.4 +4.4 + .6 +10.4 
5 —4.6 +1.7 +3.0 +9.3 +7.9 +14.8 
10 —3.7 +2.1 — .8 +65.5 + 4.3 +10.7 
15 +1.0 +7.2 
20 —2.5 +2.4 
3 1 —18.2 —12.7 — 3.1 
2 — 6.3 —16.5 —7.3 — 2.9 + 6.0 
5 —9.7 —2.2 —-10.7 —5.5 + 8 + 5.8 
10 — 2.1 + 3.1 
4 1 —12.0 —15.6 
2 —13.6 +6.1 —-11.6 — 3.9 
5 —-13.9 -—7.2 —- 9.9 — 4.0 
10 — 8.9 -—2.6 —- 6.4 — 1.2 
5 1 —8.8 
2 —18.1 — 6.0 
5 —-12.5 — 5.6 
10 | — 8.9 — 2.9 











necessary to use many of the same runs to determine values of $E_s[k^n]$ for different
values of s, k and n. This means that the errors are correlated to some slight
extent, but it would be extremely difficult to determine how much.
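
The experimental procedure described above can be imitated with a pseudo-random generator in place of the punched cards. The following Python sketch is illustrative only. It assumes that the worked example refers to s = 1, k = 2, n = 5 (the subscripts are hard to read in the source), and the names `exact_e1_pairs`, `simulate_throws` and `summarize` are hypothetical helpers, not part of the paper. It recomputes the mean and the variance of the mean from the frequency table and compares them with a simulated and an exact value.

```python
import math
import random


def exact_e1_pairs(n):
    """Exact expected number of throws until some one of n equally likely
    boxes first holds 2 balls (read here as E_1[2^n])."""
    # P(no box holds 2 balls after j throws) = n!/((n-j)! n^j) for j = 0..n;
    # the expectation is the sum of these survival probabilities.
    return sum(math.perm(n, j) / n**j for j in range(n + 1))


def simulate_throws(s, k, n, trials, rng):
    """For each trial, count throws until s of n equally likely boxes
    each hold k balls."""
    results = []
    for _ in range(trials):
        counts = [0] * n
        filled = 0
        throws = 0
        while filled < s:
            box = rng.randrange(n)
            counts[box] += 1
            throws += 1
            if counts[box] == k:  # this box has just met its quota
                filled += 1
        results.append(throws)
    return results


def summarize(freq):
    """Mean, and variance of the mean, from a {value: frequency} table."""
    total = sum(freq.values())
    mean = sum(v * f for v, f in freq.items()) / total
    ss = sum(f * (v - mean) ** 2 for v, f in freq.items())
    return mean, ss / (total - 1) / total


# Frequency table quoted in the text (100 experimental trials).
PAPER_FREQ = {2: 23, 3: 32, 4: 31, 5: 11, 6: 3}
paper_mean, paper_var = summarize(PAPER_FREQ)

# A larger simulated experiment under the assumed reading s=1, k=2, n=5.
sim = simulate_throws(1, 2, 5, 20000, random.Random(1947))
sim_mean = sum(sim) / len(sim)
exact = exact_e1_pairs(5)
```

Under this reading the table reproduces the quoted mean 3.39 and variance .011, and the exact value lies inside the quoted confidence limits.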

A summary of the computed percentage errors for various values of s, k and
n is given in Table 3. In the instances where there are two entries, they are
calculated on the basis of the 95% confidence limits for the experimental mean.
These confidence limits are symmetric and were determined by using the Student
$t$-distribution. For k equal to 2 and 5 the distribution contained 100 trials,
while for k greater than 5, the distributions were made up of approximately 50
trials.

The computations given in this table show, for various values of s, k and n,
the percentage error of the approximation for $E_s[k^n]$. In addition to showing
the values of these percentage errors, the computations lead one to conjecture
that

1. For fixed s and k, there exists an $n_0$ such that for $n > n_0$ the absolute value
of the percentage error of the approximation for $E_s[k^n]$ is a monotone decreasing
function for increasing n. It was shown by McCarthy [5] that this absolute
value approaches zero as n approaches infinity for $E_s[1^n]$, and in fact, that the
difference between the true and approximate values approaches zero.

2. For fixed s and n, there exists a $k_0$ such that for $k > k_0$, the absolute value
of the percentage error of the approximation for $E_s[k^n]$ is a monotone decreasing
function for increasing k.

6. Computation. 

6.1. Curves to aid in the computation of $E_1[k_1(p_1), k_2(p_2)]$. In 3.1 it was shown
that $E_1[k_1(p_1), k_2(p_2)]$ is equal to

$\frac{k_1}{p_1} I_{p_1}(k_1 + 1, k_2) + \frac{k_2}{p_2} I_{p_2}(k_2 + 1, k_1),$

where $I_x(p, q)$ is the Incomplete Beta-Function as tabled by Karl Pearson [6].
There are three principal difficulties connected with the use of these tables as
they apply to the approximations of this paper. These are:

1. The tables must be available,

2. The tables give directly only values for integer or half-integer values of
$k_1$ and $k_2$, and

3. Since many different values of $E_1[k_1(p_1), k_2(p_2)]$ are often required to obtain
a single approximation, the computational burden would be very heavy.

In order to surmount these difficulties, it seemed advisable to prepare curves
giving the values of $E_1[k_1(p_1), k_2(p_2)]$ for various values of $k_1$, $k_2$, $p_1$ and $p_2$.
These curves would give values of $E_1[\ ]$ with sufficient accuracy for most problems
not only for integer values of $k_1$ and $k_2$, but for all values over the range
considered.

Such curves have been prepared by computing $E_1[k_1(p_1), k_2(p_2)]$ for integral
values of $k_1$ and $k_2$ (for fixed $p_1$ and $p_2$) and then joining these points with a
smooth curve. A summary of the graphs prepared is as follows:


           $k_1$                        $k_2$                 $p_1$    $p_2$
Fig. 3     1, 2, ..., 25, $\infty$      1, 2, ..., 35         .50      .50
Fig. 4     1, 2, ..., 20, $\infty$      1, 2, ..., 35         .40      .60
Fig. 5     1, 2, ..., 15, $\infty$      1, 2, ..., 35         .20      .80
Fig. 6     1, 2, ..., 10, $\infty$      1, 2, ..., 15         .80      .20
Fig. 7     1, 2, ..., 7, $\infty$       1, 2, ..., 15         .60      .40
Fig. 8     1, 2, ..., 8, $\infty$       1, 2, ..., 15         .50      .50
Fig. 9     1, 2, ..., 6, $\infty$       1, 2, ..., 15         .40      .60
Fig. 10    1, 2, ..., 5, $\infty$       1, 2, ..., 15         .20      .80








[Figure 3. Curves of $E_1[k_1(.50), k_2(.50)]$.]


Figures 8, 9, and 10 are simply portions of figures 3, 4 and 5 drawn on an
expanded scale in order to permit greater accuracy in reading the curves. Also
figures 6 and 10 and figures 7 and 9 form pairs in that a member of one pair can
be obtained from the other member of the pair. Both members of the pair are
given on the expanded scale in order to facilitate interpolation. Values of the
mean for combinations of $k_1$ and $k_2$ not given directly can usually be obtained
with sufficient accuracy with linear interpolation. Interpolation for $p_1$ and $p_2$
should be done graphically since in some instances linear interpolation would be
extremely poor.

[Figure 4. Curves of $E_1[k_1(.40), k_2(.60)]$.]

[Figure 5. Curves of $E_1[k_1(.20), k_2(.80)]$.]

















As an example, suppose one has two boxes with $k_1 = 2$, $k_2 = 5$, $p_1 = .40$ and
$p_2 = .60$. Consulting Fig. 9, one goes along the horizontal axis to $k_2 = 5$.
Following up the vertical line through this point to the curve $k_1 = 2$,
$E_1[2(.40), 5(.60)]$ is read as 4.25. The actually computed value to four decimals is 4.2224.

[Figure 6. Curves of $E_1[k_1(.80), k_2(.20)]$.]
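
For integer quotas the Incomplete Beta-Function in the two-box expression reduces to a finite negative-binomial sum, so a value such as the one just quoted can be checked without tables or curves. The Python sketch below is an illustration under that assumption. The function names are hypothetical, and fractional quotas (which the curves handle by interpolation) are not covered.

```python
from math import comb


def neg_binom_cdf(r, p, m):
    """P(at most m failures occur before the r-th success), success
    probability p. For integers this equals I_p(r, m + 1)."""
    if m < 0:
        return 0.0
    q = 1.0 - p
    return sum(comb(r - 1 + j, j) * p**r * q**j for j in range(m + 1))


def e1_two_boxes(k1, p1, k2, p2):
    """Expected throws until box 1 holds k1 balls or box 2 holds k2 balls,
    i.e. (k1/p1) I_{p1}(k1+1, k2) + (k2/p2) I_{p2}(k2+1, k1), p1 + p2 = 1."""
    return ((k1 / p1) * neg_binom_cdf(k1 + 1, p1, k2 - 1)
            + (k2 / p2) * neg_binom_cdf(k2 + 1, p2, k1 - 1))


def e2_two_boxes(k1, p1, k2, p2):
    """Expected throws until both quotas are met (maximum = sum - minimum)."""
    return k1 / p1 + k2 / p2 - e1_two_boxes(k1, p1, k2, p2)


value = e1_two_boxes(2, 0.40, 5, 0.60)
```

For the example in the text, `value` agrees with the computed figure 4.2224 to four decimals.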





[Figure 7. Curves of $E_1[k_1(.60), k_2(.40)]$.]



It is immediately evident that $E_2[k_1(p_1), k_2(p_2)]$ can also be obtained from the
curves since

$E_2[k_1(p_1), k_2(p_2)] = (k_1/p_1) + (k_2/p_2) - E_1[k_1(p_1), k_2(p_2)].$

6.2. Use of the curves to obtain exact values (i.e. subject only to the error of reading
the curves) for $E_1[k_1(p_1), k_2(p_2), k_3(p_3)]$. Referring back to (4.8), one obtains
that

(6.1)  $E_1[k_1(p_1), k_2(p_2), k_3(p_3)] = \sum_j \pi_j E_1[x_j(p_1 + p_2), k_3(p_3)],$

where $\pi_j$ is the probability that either $k_1$ balls are obtained in the first box or $k_2$
balls are obtained in the second box on the $x_j$th throw for the first time, assuming
balls can go only in boxes one and two. $x_j$ takes on the values

$k_1, k_1 + 1, \cdots, k_1 + k_2 - 1$

when $k_1 < k_2$. Now $\pi_j$ can be easily computed and $E_1[x_j(p_1 + p_2), k_3(p_3)]$
can be obtained from the curves. The only difficulty in using this procedure
arises when the range of $x_j$ is large. Then a large amount of computation is
involved.

[Figure 8. Curves of $E_1[k_1(.50), k_2(.50)]$, expanded scale.]

[Figure 9. Curves of $E_1[k_1(.40), k_2(.60)]$, expanded scale.]


In order to illustrate this computation, consider $E_1[2(.1), 3(.1), 5(.8)]$. Here
$x_j$ takes on the values 2, 3 and 4. We have $x_1 = 2$, $\pi_1 = 2/8$; $x_2 = 3$, $\pi_2 = 3/8$;
and $x_3 = 4$, $\pi_3 = 3/8$. From Fig. 6

$E_1[2(.2), 5(.8)] = 5.09$
$E_1[3(.2), 5(.8)] = 5.88$
$E_1[4(.2), 5(.8)] = 6.11.$

Consequently, $E_1[2(.1), 3(.1), 5(.8)]$ is equal to

(5.09)(2/8) + (5.88)(3/8) + (6.11)(3/8) = 5.77.

Using computed values for $E_1[x_j(.2), 5(.8)]$, $E_1[2(.1), 3(.1), 5(.8)]$ is equal to
5.75. Thus the use of the curves has only led to an error of .3%.

[Figure 10. Curves of $E_1[k_1(.20), k_2(.80)]$, expanded scale.]
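
The three-box decomposition can also be carried out numerically. The Python sketch below is illustrative and restricted to integer quotas. It computes the weights $\pi_j$ from the two-box stopping distribution, combines them with exact two-box values in place of curve readings, and compares the result with a brute-force recursion over ball counts. All function names are hypothetical.

```python
from functools import lru_cache
from math import comb


def neg_binom_cdf(r, p, m):
    """P(at most m failures before the r-th success); equals I_p(r, m + 1)."""
    if m < 0:
        return 0.0
    return sum(comb(r - 1 + j, j) * p**r * (1.0 - p)**j for j in range(m + 1))


def e1_two_boxes(k1, p1, k2, p2):
    """Two-box expected waiting time (integer quotas, p1 + p2 = 1)."""
    return ((k1 / p1) * neg_binom_cdf(k1 + 1, p1, k2 - 1)
            + (k2 / p2) * neg_binom_cdf(k2 + 1, p2, k1 - 1))


def stopping_distribution(k1, p1, k2, p2):
    """pi[x]: probability that the first of the two quotas is met on throw x
    when balls can fall only in boxes one and two."""
    q1 = p1 / (p1 + p2)
    q2 = p2 / (p1 + p2)
    pi = {}
    for b in range(k2):  # box 1 fills while box 2 holds b < k2 balls
        x = k1 + b
        pi[x] = pi.get(x, 0.0) + comb(x - 1, k1 - 1) * q1**k1 * q2**b
    for a in range(k1):  # box 2 fills while box 1 holds a < k1 balls
        x = k2 + a
        pi[x] = pi.get(x, 0.0) + comb(x - 1, k2 - 1) * q2**k2 * q1**a
    return pi


def e1_three_boxes(k1, p1, k2, p2, k3, p3):
    """Weighted sum of two-box values, following the form of (6.1)."""
    pi = stopping_distribution(k1, p1, k2, p2)
    return sum(w * e1_two_boxes(x, p1 + p2, k3, p3) for x, w in pi.items())


def e1_direct(quotas, probs):
    """Independent check: recursion over the state of ball counts."""
    @lru_cache(maxsize=None)
    def expect(state):
        if any(c >= k for c, k in zip(state, quotas)):
            return 0.0
        return 1.0 + sum(
            p * expect(state[:i] + (state[i] + 1,) + state[i + 1:])
            for i, p in enumerate(probs))
    return expect((0,) * len(quotas))


weights = stopping_distribution(2, 0.1, 3, 0.1)
by_formula = e1_three_boxes(2, 0.1, 3, 0.1, 5, 0.8)
by_recursion = e1_direct((2, 3, 5), (0.1, 0.1, 0.8))
```

The weights come out as 2/8, 3/8 and 3/8, and both routes give a value that rounds to the 5.75 quoted in the text.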





6.3. Use of the curves in approximating $E_1[k_1(p_1), \cdots, k_n(p_n)]$,
$E_n[k_1(p_1), \cdots, k_n(p_n)]$ and $E_s[k^n]$. In illustrating the application of the
curves and the reduction formulae (4.34) and (4.35), one example will be worked
through in detail. This example will provide illustrations of all the details
involved in such problems. Consider $E_4[5^5]$. Applying formula (4.34)

(6.2)  $E_4[5^5] \approx \frac{5}{4}\left\{E_1[5^4] + E_3\left[\left(5 - \frac{E_1[5^4] - 5}{3}\right)^3, \left(5 - \frac{E_1[5^4]}{4}\right)\right]\right\}.$

Consequently, the first step must be to compute $E_1[5^4]$. Using the principles of
4.4

(6.3)  $E_1[5^4] \approx E_1\{E_1[5^2](.50), 5(.25), 5(.25)\}.$

From Fig. 8, $E_1[5^2] = 7.55$. Therefore $E_1[5^4]$ is approximately equal to

$E_1[7.55(.50), 5(.25), 5(.25)].$

Now applying the same principle again,

(6.4)  $E_1[5^4] \approx E_1\{E_1[7.55(\tfrac{2}{3}), 5(\tfrac{1}{3})](.75), 5(.25)\}.$

By the use of figures 7, 8, 9 and 10, graphical interpolation may be applied to
find that $E_1[7.55(\tfrac{2}{3}), 5(\tfrac{1}{3})]$ is equal to 9.84. The approximation procedure now
says that

(6.5)  $E_1[5^4] \approx E_1[9.84(.75), 5(.25)].$

Again applying the curves and using graphical interpolation for $p_1$ and $p_2$,
$E_1[5^4] \approx 11.88$.

Substituting this value in (6.2),

(6.6)  $E_4[5^5] \approx \frac{5}{4}\{11.88 + E_3[2.71, 2.71, 2.71, 2.03]\}.$

Now formula (4.35) must be applied to $E_3[2.71, 2.71, 2.71, 2.03]$, i.e.

(6.7)  $E_3[2.71, 2.71, 2.71, 2.03] \approx \frac{4}{3}\left\{E_1[(2.71)^3] + E_2\left[\left(2.71 - \frac{E_1[(2.71)^3] - 2.71}{2}\right)^2, \left(2.03 - \frac{E_1[(2.71)^3]}{3}\right)\right]\right\}.$

$E_1[(2.71)^3]$ can be evaluated by the same method used for $E_1[5^4]$. This leads to
the result

(6.8)  $E_3[2.71, 2.71, 2.71, 2.03] \approx \frac{4}{3}\{4.40 + E_2[1.86, 1.86, .56]\}.$

Once more applying (4.35)

(6.9)  $E_2[1.86, 1.86, .56] \approx \frac{3}{2}\left\{E_1[(1.86)^2] + E_1\left[\left(2 \cdot 1.86 - E_1[(1.86)^2]\right), \left(.56 - \frac{E_1[(1.86)^2]}{2}\right)\right]\right\}.$

$E_1[1.86, 1.86]$ is equal, by the curves, to 2.25. Therefore

(6.10)  $E_2[1.86, 1.86, .56] \approx \frac{3}{2}\{2.25 + E_1[1.47, -.56]\}.$

However, since the convention is observed that a negative quantity is replaced
by zero,

(6.11)  $E_1[1.47, -.56] = E_1[1.47, 0] = 0.$

Now working back through these various expressions,

(6.12)  $E_4[5^5] \approx \frac{5}{4}\left[11.88 + \frac{4}{3}\left[4.40 + \frac{3}{2}[2.25 + 0]\right]\right] = 27.81.$

From Table 3 it can be seen that the percentage errors for this approximation
to $E_4[5^5]$, corresponding to the 95% confidence limits for this quantity, are -4.0%
and -9.9%.

This example has illustrated most of the situations which will arise in the use
of the approximations of this paper.
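
The arithmetic of the worked example can be retraced in a few lines. The sketch below is not the reduction formulae (4.34) and (4.35) themselves, which are stated earlier in the paper. It only assumes, from the numbers quoted in the example, that after a first-passage time `e1` among `m` equal boxes each unfilled equal box has its quota reduced by `(e1 - quota)/(m - 1)` and the remaining box by `e1/m`. The helper name `reduce_quotas` is hypothetical, and the values 11.88, 4.40 and 2.25 are the curve readings quoted in the text.

```python
def reduce_quotas(quota, m, e1, extra_quota):
    """Reduced quotas after e1 throws among m equal boxes (assumed reading):
    returns (quota left in each unfilled equal box, quota left in the extra box)."""
    equal_left = quota - (e1 - quota) / (m - 1)
    extra_left = extra_quota - e1 / m
    return equal_left, extra_left


# Curve readings quoted in the example.
step1 = reduce_quotas(5.0, 4, 11.88, 5.0)    # quotas entering E_3[...]
step2 = reduce_quotas(2.71, 3, 4.40, 2.03)   # quotas entering E_2[...]
step3 = reduce_quotas(1.86, 2, 2.25, 0.56)   # quotas entering the last E_1[...]

# A negative remaining quota is replaced by zero, so the last term vanishes.
last_term = 0.0 if min(step3) <= 0 else None

# Back-substitution with the factors 5/4, 4/3 and 3/2.
total = 5 / 4 * (11.88 + 4 / 3 * (4.40 + 3 / 2 * (2.25 + last_term)))
```

With these inputs the reduced quotas round to those shown in the example, and `total` rounds to 27.81.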
6.4. Miscellaneous approximation formulae useful for computation. There
exists a relatively simple approximation to $E_1[k_1(p_1), k_2(p_2)]$, $p_1 + p_2 = 1$, when
$p_2$ is near one. Using (3.8) and making some obvious simplifications, one obtains

$E_1[k_1(p_1), k_2(p_2)] = \frac{k_2}{p_2} + \frac{(k_1 + k_2)!}{(k_1 - 1)!\,(k_2 - 1)!}\,\frac{1}{p_1 p_2}\int_0^{p_1} t^{k_1 - 1}(1 - t)^{k_2 - 1}(t - p_1)\,dt.$

Since $p_1$ is near zero, $(1 - t)$ can be replaced by one, and the result is obtained
that

$E_1[k_1(p_1), k_2(p_2)] \approx \frac{k_2}{p_2} - \frac{p_1^{k_1}}{p_2}\,\frac{(k_1 + k_2)!}{(k_1 + 1)!\,(k_2 - 1)!}.$

An approximation to the Incomplete Beta-Function, given by Tukey and 


Scheffé [8], may also prove useful at times. The expression, changed slightly 
by those authors since publication, is 


hin-r+in~1- E - x ‘ e 1) gy? 
’ ’ ar(r) f, \2 = 








where 


, 
Vi + 2r. 

The right hand side of the first expression will be recognized as the $\chi^2$
distribution with 2r degrees of freedom. In the event that the tables of $\chi^2$ are not
adequate for the application of these expressions, the approximation of Wilson and
Hilferty [10] should be used. This approximation states that $(\chi^2/\nu)^{1/3}$, where
$\nu$ is the number of degrees of freedom, is approximately normally distributed
with mean $1 - 2/(9\nu)$ and variance $2/(9\nu)$, for large $\nu$.
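
As a rough check on the Wilson and Hilferty statement, the approximation can be set beside the exact $\chi^2$ probability, which has a closed form when the number of degrees of freedom is even. The Python sketch below is illustrative and the function names are hypothetical.

```python
from math import erf, exp, sqrt


def normal_cdf(z):
    """Standard normal probability integral."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def chi2_cdf_wilson_hilferty(x, nu):
    """P(chi-square with nu d.f. <= x), treating (chi2/nu)^(1/3) as normal
    with mean 1 - 2/(9 nu) and variance 2/(9 nu)."""
    mean = 1.0 - 2.0 / (9.0 * nu)
    sd = sqrt(2.0 / (9.0 * nu))
    return normal_cdf(((x / nu) ** (1.0 / 3.0) - mean) / sd)


def chi2_cdf_even(x, nu):
    """Exact probability for even nu = 2r:
    1 - exp(-x/2) * sum_{j<r} (x/2)^j / j!"""
    r = nu // 2
    term, total = 1.0, 0.0
    for j in range(r):
        total += term
        term *= (x / 2.0) / (j + 1)
    return 1.0 - exp(-x / 2.0) * total


approx = chi2_cdf_wilson_hilferty(10.0, 10)
exact = chi2_cdf_even(10.0, 10)
```

Even for 10 degrees of freedom the two values differ only in the fourth decimal place at the points tried here.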




7. Acknowledgements. The author wishes to express his grateful appreciation
for the many helpful comments and suggestions received from Professors
W. G. Cochran, A. M. Mood, J. W. Tukey and S. S. Wilks.


REFERENCES

[1] R. A. Fisher and F. Yates, Statistical Tables for Biological, Agricultural and Medical
Research, Oliver and Boyd, London, 1943, Table XXII.

[2] M. A. Girshick, Frederick Mosteller, and L. J. Savage, "Unbiased estimates for
certain binomial sampling problems with applications", Annals of Math. Stat.,
Vol. 17 (1946), pp. 13-23.

[3] J. B. S. Haldane, "On a method of estimating frequencies", Biometrika, Vol. 33
(1945), pp. 222-225.

[4] P. S. Laplace, Théorie Analytique des Probabilités, Mme. Ve Courcier, Paris, 1820,
pp. 194-219.

[5] P. J. McCarthy, Approximate Solutions for Means and Variances in a Certain Class of
Box Problems, unpublished thesis, Library, Princeton University, 1946.

[6] Karl Pearson, Tables of the Incomplete Beta-Function, The "Biometrika" Office,
London, 1934.

[7] L. H. C. Tippett, Random Sampling Numbers, Cambridge Univ. Press, 1927.

[8] J. W. Tukey and H. Scheffé, "A formula for sample sizes for population tolerance
limits", Annals of Math. Stat., Vol. 15 (1944), p. 217.

[9] J. V. Uspensky, Introduction to Mathematical Probability, McGraw-Hill, 1937, p. 181.

[10] E. B. Wilson and M. M. Hilferty, "The distribution of chi-square", Proc. Nat.
Acad. Sci., Vol. 17 (1931), pp. 684-688.














THE DISTRIBUTION OF THE RANGE¹


By E. J. GuMBEL 
Brooklyn College, N. Y. 





1. Summary. The asymptotic distribution of the range $w$ for a large sample
taken from an initial unlimited distribution possessing all moments is obtained
by the convolution of the asymptotic distribution of the two extremes. Let $\alpha$
and $u$ be the parameters of the distribution of the extremes for a symmetrical
variate, and let $R = \alpha(w - 2u)$ be the reduced range. Then its asymptotic
probability $\Psi(R)$ and its asymptotic distribution $\psi(R)$ may be expressed by the
Hankel function of order one and zero. A table is given in the text.

The asymptotic distribution $g(w)$ of the range proper is obtained from $\psi(R)$
by the usual linear transformation. The initial distribution and the sample 
size influence the position and the shape of the distribution of the range in the 
same way as they influence the distribution of the largest value. If we take the 
parameters from the calculated means and standard deviations, the asymptotic 
distribution of the range gives a good fit to the calculated distributions for normal 
samples from size 6 onward. Consequently the distribution of the range for 
normal samples of any size larger than 6 may be obtained from the asymptotic 
distribution of the reduced range. 

The asymptotic probabilities and the asymptotic distributions of the mth 
range and of the range for asymmetrical distributions are obtained by the same 
method and lead to integrals which may be evaluated by numerical methods. 


2. Introduction. For any initial distribution, and any sample size n, the dis- 
tribution of the range may easily be written down in the form of an integral. 
However, for many given initial distributions the integration can be carried out,
if at all, only for very small sample sizes, say n = 2 or n = 3. For larger
samples, complicated numerical calculations have to be made, and there is no 
way of obtaining the distribution for n + 1 observations from the distribution 
for n observations. 

Our object is to obtain the asymptotic distribution of the range. Nothing is 
supposed to be known about the initial distribution, except that it is of the ex- 
ponential type [9] which assures that it is unlimited in both directions, and pos- 
sesses all moments. It will be shown that this condition is sufficient for the 
existence of an asymptotic distribution of the range. 

With increasing sample sizes the distribution of the range may approach its
asymptotic form in a quick, or in a slow way. This behavior depends upon the 
nature of the initial distribution. Two examples for this approach will be 
shown. 






1 Research done with the support of a grant from the Social Science Research Council. 

DISTRIBUTION OF RANGE 385 


3. The exact distribution of the range. Let $\varphi(x)$ be any initial distribution,
$\Phi(x)$ the probability of a value equal to, or less than, $x$. Then, for samples
of size n, the joint distribution $w_n(x_1, x_n)$ of the smallest value $x_1$ and the largest
value $x_n$ is

(1)  $w_n(x_1, x_n) = n(n - 1)\,\varphi(x_1)\,(\Phi(x_n) - \Phi(x_1))^{n-2}\,\varphi(x_n).$

The distribution $g_n(w_n)$ of the range $w_n$ defined by

(2)  $x_n = x_1 + w_n$

is obtained by integrating over all values $x_1 < x_n$ whence

(3)  $g_n(w_n) = n(n - 1)\int_{-\infty}^{+\infty}(\Phi(x + w_n) - \Phi(x))^{n-2}\,\varphi(x + w_n)\,\varphi(x)\,dx,$

where the index 1 has been dropped. The probability $G_n(w_n)$ for the range to
be equal to, or less than, $w_n$ is obtained by integration of (3), whence, by
reversing the order of integration,

$G_n(w_n) = n\int_{-\infty}^{+\infty}\int_0^{w_n}(n - 1)(\Phi(x + w) - \Phi(x))^{n-2}\,d\Phi(x + w)\,d\Phi(x),$

or, after integration,

$G_n(w_n) = n\int_{-\infty}^{+\infty}(\Phi(x + w_n) - \Phi(x))^{n-1}\,d\Phi(x),$

a formula to which Prof. H. Hotelling has drawn my attention. The beauty of
this formula is completely marred by the facts that, in general, we cannot express
$\Phi(x + w_n)$ by $\Phi(x)$, and that the numerical integration is lengthy and tiresome.
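
The numerical integration referred to here is easy to sketch for a normal initial distribution. The Python fragment below is an illustration only. It applies the trapezoidal rule to the single-integral form of $G_n(w_n)$ and uses the case n = 2 as a check, since the range of two independent standard normal values is the absolute value of a normal variate with variance 2. The function name `range_cdf`, the integration limits and the step count are arbitrary choices.

```python
from math import erf, exp, pi, sqrt


def Phi(x):
    """Standard normal probability."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def phi(x):
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def range_cdf(w, n, lo=-8.0, hi=8.0, steps=4000):
    """G_n(w) = n * integral of (Phi(x + w) - Phi(x))^(n-1) phi(x) dx,
    evaluated by the trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * (Phi(x + w) - Phi(x)) ** (n - 1) * phi(x)
    return n * h * total


# For n = 2 the probability is 2 Phi(w / sqrt 2) - 1 = erf(w / 2).
check_n2 = range_cdf(1.0, 2)
closed_form = erf(0.5)
```

The same function can be called with larger n, at the cost of one such integration for every value of the range.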

The problem of the range for the normal distribution was first raised twenty
five years ago by L. von Bortkiewicz [1, 2]. For n = 2 and n = 3 the distribution
of the normal range may be written down explicitly [12, 13]. For larger
normal samples up to n = 20, E. S. Pearson [16] and H. O. Hartley [10] have
calculated numerical tables of the probability of the range. L. H. C. Tippett
[20] has calculated the mean, the standard deviation, and the moment quotients
for the range of the normal distribution up to n = 1000. He gave formulae for
the moments in the form of integrals. Finally "Student" [18] reproduced the
distribution of the range for small samples, n = 2, 3, 4, 5, 6, 10, by Pearson's
type I, and gave a formula for large samples n = 20, 60, based on Pearson's type
VI, a procedure which is purely empirical and, therefore, unsatisfactory for
theoretical purposes. A good résumé of the present knowledge about the
range is given in Karl Pearson's Tables [17].

All these studies are confined to the normal distribution and allow no conclusion
about the asymptotic distribution of the range. According to Kendall [11]
it is not known whether such forms exist and what they are. This question may
at once be answered for a special case. If the distribution is limited to the left
(or to the right), the asymptotic distribution of the range is equal to the asymptotic
distribution of the largest (smallest) value. The asymptotic distribution
of the range exists provided that an asymptotic distribution of the largest
(smallest) value exists. For the exponential distribution, and for initial
distributions of the Pareto type, for example, the asymptotic distribution of the
range is equal to the asymptotic distribution of the largest value. The asymptotic
distribution of the range for the rectangular distribution has been derived
by A. G. Carlton [3].

























4. The asymptotic distribution of the reduced range for a symmetrical
variate. Instead of the procedures mentioned in the last paragraph, let us
consider a large sample. It is generally assumed that the smallest and the
largest values are independent in that case. L. H. C. Tippett [20] has shown
that the correlation between the extremes is negligible for the normal distribution
and for sample sizes n = 200. In a previous note [9] it has been shown that
independence holds for large samples and for initial distributions of the
exponential type unlimited in both directions and possessing all moments. Then
the joint distribution (1) splits into the product of the asymptotic distribution
$f_1(x_1)$ of the smallest value $x_1$ and the asymptotic distribution $f_n(x_n)$ of the largest
value $x_n$

(4)  $w(x_1, x_n) = f_1(x_1)\,f_n(x_n).$

If, furthermore, the initial distribution is symmetrical about zero, the two
asymptotic distributions are

(5)  $f_1(x_1) = \alpha\exp[\alpha(x_1 + u) - e^{\alpha(x_1 + u)}]; \quad f_n(x_n) = \alpha\exp[-\alpha(x_n - u) - e^{-\alpha(x_n - u)}].$

These asymptotic distributions and the corresponding probabilities are traced,
in a reduced scale, on Graphs (1) and (2).

Since the two parameters $u$ and $\alpha$ will exist also in the asymptotic distribution
of the range, their nature must briefly be explained. The value $u$ is defined as
the solution of

(6)  $\Phi(u) = 1 - \frac{1}{n}.$

Since

(6')  $n(1 - \Phi(u)) = 1,$

the largest value $u$ may be called the expected largest value. It differs, of course,
from the mean of the largest value. It has been shown [6] that $u$ increases as
a function of the logarithm of n, the function depending upon the initial
distribution.

Criteria for the approach of the distribution of the largest value toward its
asymptotic form have been given by R. A. Fisher and L. H. C. Tippett [4].











For our purpose it is sufficient to consider whether n is so large that $u$ is very
near to the most probable largest value $\tilde{x}_n$ obtained from

(7)  $\frac{n - 1}{\Phi(\tilde{x}_n)}\,\varphi(\tilde{x}_n) = -\frac{\varphi'(\tilde{x}_n)}{\varphi(\tilde{x}_n)}.$

If

$\tilde{x}_n \approx u$

holds with sufficient approximation, $2u$ may be interpreted as the range of the
modes for an initial symmetrical distribution.

The parameter $\alpha$ defined by

(8)  $\alpha = \frac{\varphi(u)}{1 - \Phi(u)}$

also is a function of n. Three cases have to be distinguished: In the first case, $\alpha$
is a constant, or converges with n toward a constant different from zero. In the
second (and third) case, $\alpha$ increases with n without limit (decreases with n
toward zero). The three cases correspond to three classes of initial distributions
of the exponential type. The function $\alpha$ is related to the asymptotic standard
error of the largest, and of the smallest value by

(9)  $\alpha^2\sigma_n^2 = \alpha^2\sigma_1^2 = \frac{\pi^2}{6}.$

If $\alpha$ increases (decreases) with n, or is independent of n, the standard error of
the largest value decreases (increases) with the sample size, or is independent
of it. This behavior has nothing to do with the fact that the standard error of
the mean decreases, of course, with an increasing number of samples.

The determination of the constants $u$ and $\alpha$ from equations (6), (7), (8) is
based on the knowledge of the initial distribution and the sample size n from
which we take the largest observation. This method cannot be used in many
practical applications: 1) It may happen that the initial distribution, or the
parameters it contains, are unknown. Therefore the parameters of the largest
value cannot be obtained from it. 2) The initial distribution might be known,
but the number of observations is insufficient to warrant this procedure, because
the most probable largest value $\tilde{x}_n$ differs from the expected value $u$. In these
cases the parameters $u$ and $\alpha$ have to be estimated from the observed distribution
of the largest value alone. A similar procedure will be used for the range in
paragraph 7.

From (4) and (5) the joint asymptotic distribution $w(x_1, w)$ of the smallest
value $x_1$ and the range $w$ becomes

$w(x_1, w) = \alpha^2\exp[-\alpha(w - 2u) - e^{\alpha(x_1 + u)} - e^{-\alpha(x_1 + w - u)}].$

The asymptotic distribution $g(w)$ of the range alone is, dropping the index 1,

(4')  $g(w) = \alpha^2 e^{-\alpha(w - 2u)}\int_{-\infty}^{+\infty}\exp[-e^{\alpha(x + u)} - e^{-\alpha(x + w - u)}]\,dx.$








This distribution contains the two parameters $\alpha$ and $u$ existing in the asymptotic
distribution of the largest value. To eliminate the two parameters, a reduced
range $R$ is introduced by

(10)  $R = \alpha(w - 2u).$

The range $w$ is a positive variate unlimited toward the right. The reduced
range $R$ is also unlimited toward the right yet limited toward the left by

(10')  $R = -2\alpha u.$

The reduced range is not related to one of the averages of the range. It is the
range minus the range of the modes divided by a factor which is proportional to
the standard error of the extreme value. The distribution $\psi(R)$ of the reduced
range $R$, and the distribution $g(w)$ of the range $w$ are related by

(11)  $\psi(R) = \frac{1}{\alpha}\,g(w),$

subject to restriction (10'), whereas the probability $\Psi(R)$ of the reduced range to
be equal to, or less than $R$ is equal to the corresponding expression $G(w)$ for the
range proper

(11')  $\Psi(R) = G(w).$

For the integration in (4') we put

$\alpha(x + w - u) = -y$

whence, from (10),

$\alpha(x + u) = -y - R.$

The asymptotic distribution of the reduced range becomes

(12)  $\psi(R) = e^{-R}\int_{-\infty}^{+\infty}\exp[-e^{y} - e^{-y-R}]\,dy$

and the asymptotic probability $\Psi(R)$ of the range is

(13)  $\Psi(R) = \int_{-\infty}^{+\infty}\exp[y - e^{y} - e^{-y-R}]\,dy,$

an expression which may easily be verified by differentiation.
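
The two integrals can be evaluated directly, and the relation between them checked by a difference quotient. The Python sketch below is illustrative only. It assumes the forms $\psi(R) = e^{-R}\int\exp[-e^{y} - e^{-y-R}]\,dy$ and $\Psi(R) = \int\exp[y - e^{y} - e^{-y-R}]\,dy$, truncates the range of integration where the integrand is negligible, and uses the trapezoidal rule. The helper names, step size and cut-offs are arbitrary choices.

```python
from math import exp


def _trapezoid(f, lo, hi, h=0.02):
    """Trapezoidal rule on [lo, hi] with step close to h."""
    n = max(2, int(round((hi - lo) / h)))
    step = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * step)
    return total * step


def psi(R):
    """Distribution (density) of the reduced range, integral form (12)."""
    if R < -8.0:  # the integrand is below exp(-100) everywhere
        return 0.0
    return exp(-R) * _trapezoid(lambda y: exp(-exp(y) - exp(-y - R)),
                                -R - 4.5, 4.5)


def Psi(R):
    """Probability of the reduced range, integral form (13)."""
    if R < -8.0:
        return 0.0
    return _trapezoid(lambda y: exp(y - exp(y) - exp(-y - R)),
                      -R - 4.5, 4.5)


# Central difference quotient of Psi compared with psi at R = 1.
h = 1e-3
slope = (Psi(1.0 + h) - Psi(1.0 - h)) / (2.0 * h)
density = psi(1.0)
```

In this sketch `slope` and `density` agree closely, `Psi` tends to one for large R, and at R = 0 the values are about 0.2797 for the probability and 0.2278 for the distribution.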

The asymptotic formulas (12) and (13) hold for any initial symmetrical
distribution of the exponential type, for example, for the normal and the logistic
distribution (see par. 7). The mean reduced range $\bar{R}$ and the higher moments
of the reduced range are easily obtained from the mean $\bar{w}$, the variance $\sigma_w^2$, and
the invariants $\lambda_\nu$ of order $\nu$ of the range proper $w$ given in a previous paper [8].
They are

(14)  $\bar{w} = 2\left(u + \frac{\gamma}{\alpha}\right);$

(15)  $\sigma_w^2 = \frac{\pi^2}{3\alpha^2}; \quad \lambda_\nu = \frac{2(\nu - 1)!}{\alpha^\nu}\sum_{k=1}^{\infty}\frac{1}{k^\nu},$

where $\gamma$ stands for Euler's constant.
Consequently the mean $\bar{R}$, the variance $\sigma_R^2$ and the invariants $\lambda_\nu$ of the reduced
range are

(16)  $\bar{R} = 2\gamma; \quad \sigma_R^2 = \frac{\pi^2}{3}; \quad \lambda_\nu = 2(\nu - 1)!\sum_{k=1}^{\infty}\frac{1}{k^\nu}, \quad \nu \geq 2.$

Equation (14) leads to an interpretation of the reduction (10) which may be
written

$R = \alpha(w - \bar{w}) + 2\gamma$

or

(14')  $R = \frac{\pi}{\sqrt{3}}\,\frac{w - \bar{w}}{\sigma_w} + 2\gamma.$

Thus the transformation (10) is a linear function of the standard transformation
$(w - \bar{w})/\sigma_w$ usual in statistics.


5. The probability of the range as a Bessel function. The integrals
(12) and (13) may be evaluated by numerical procedures, since tables of the
function $\exp(-e^{-y})$ are easily calculated. However, it turned out to be simpler
to relate these integrals to the solution of a differential equation. The
derivative $\psi'(R)$ of the distribution (12) is

$\psi'(R) = -\psi(R) + e^{-R}\int_{-\infty}^{+\infty}\exp[-y - R - e^{y} - e^{-y-R}]\,dy.$

The integral is equal to the probability $\Psi(R)$ since the transformation

$y + R = -z$

leads to

$\int_{-\infty}^{+\infty}\exp[-y - R - e^{y} - e^{-y-R}]\,dy = \int_{-\infty}^{+\infty}\exp[z - e^{-z-R} - e^{z}]\,dz.$

Consequently the probability $\Psi(R)$ is subject to the differential equation

(17)  $\Psi'' + \Psi' - e^{-R}\Psi = 0.$
















The mode of the reduced range is a fixed value $\tilde{R}$ such that

(18)  $\psi(\tilde{R}) = e^{-\tilde{R}}\,\Psi(\tilde{R}).$

Mr. W. Wasow (Swarthmore College) has drawn my attention to the fact that
the probability $\Psi(R)$ of the range can be expressed in terms of a Bessel function.²
To obtain this simplification of the differential equation we introduce a new
positive variable $z$ by

(19)  $z = 2e^{-R/2}$

and a new function $U$ by

(20)  $\Psi = U \cdot z.$

The boundary conditions are

(21)  $z = 0,\ \Psi = 1; \quad z = \infty,\ \Psi = 0.$


The first derivative becomes, from (19)

$\frac{d\Psi}{dR} = -\frac{z}{2}\,\frac{d\Psi}{dz}$

whence, from (20)

$\frac{d\Psi}{dR} = -\frac{z}{2}\left(U + z\frac{dU}{dz}\right).$

The second derivative becomes, by the same procedure

$\frac{d^2\Psi}{dR^2} = \frac{z}{2}\,\frac{d}{dz}\left(\frac{zU}{2} + \frac{z^2}{2}\frac{dU}{dz}\right).$

The second member may be written

$\frac{z}{2}\left(\frac{U}{2} + \frac{3z}{2}\frac{dU}{dz} + \frac{z^2}{2}\frac{d^2U}{dz^2}\right).$

Thus the differential equation (17) is now

$\frac{z^3}{4}\frac{d^2U}{dz^2} + \frac{3z^2}{4}\frac{dU}{dz} + \frac{zU}{4} - \frac{z^2}{2}\frac{dU}{dz} - \frac{zU}{2} - \frac{z^3U}{4} = 0.$

Multiplication by $4z^{-1}$ leads to

(21')  $z^2\frac{d^2U}{dz^2} + z\frac{dU}{dz} - (z^2 + 1)U = 0.$

This is one of the classical Bessel differential equations of order 1. In the
notation used by the British Tables [14] (pp. 264 and 213) one of the solutions is

(22)  $U(z) = K_1(z),$






² I take this occasion to thank him for this and other valuable suggestions.




























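where $K_1(z)$, the modified Bessel function of the second kind (Hankel function),
is defined by

$$K_1(z) = (\gamma - \lg 2 + \lg z)\sum_{\nu=0}^{\infty}\frac{1}{\nu!\,(\nu+1)!}\left(\frac{z}{2}\right)^{2\nu+1} + \frac{1}{z} - \sum_{\nu=1}^{\infty}\frac{1}{(\nu-1)!\,\nu!}\left(\frac{z}{2}\right)^{2\nu-1}\left(S_\nu - \frac{1}{2\nu}\right), \qquad S_\nu = \sum_{k=1}^{\nu}\frac{1}{k}. \tag{23}$$

The relation between the function $K_1(z)$ and the Hankel function $H_1^{(1)}(z)$ is

$$K_1(z) = -\frac{\pi}{2}\,H_1^{(1)}(iz). \tag{23a}$$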


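The asymptotic probability for the range is, from (20) and (22),

$$\Psi(R) = z\,K_1(z) \tag{24}$$

or, from (19),

$$\Psi(R) = 2e^{-R/2}\,K_1\!\left(2e^{-R/2}\right). \tag{25}$$

This is the only Bessel function satisfying the boundary conditions (21). The
asymptotic probability $\Psi(R)$ of the range may be written finally from (25), (23)
and (10)

$$1 - \Psi(R) = \sum_{\nu=1}^{\infty}\frac{e^{-\nu R}}{(\nu-1)!\,\nu!}\left(R - 2\gamma + 2S_\nu - \frac{1}{\nu}\right), \qquad S_\nu = \sum_{k=1}^{\nu}\frac{1}{k}. \tag{25a}$$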





The distribution

$$\psi(R) = \frac{d\Psi(R)}{dz}\,\frac{dz}{dR}$$

of the reduced range $R$ is, from (24) and (19),

$$\psi(R) = -\frac{z}{2}\left(K_1(z) + z\,K_1'(z)\right).$$

Now, the derivative $K_1'(z)$ is linked to the modified Bessel function $K_0(z)$ of
the second kind and of order zero by

$$z\,K_1'(z) = -K_1(z) - z\,K_0(z).$$

Consequently the distribution is

$$\psi(R) = \frac{z^{2}}{2}\,K_0(z) \tag{26}$$


‘ 







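or, from (19),

$$\psi(R) = 2e^{-R}\,K_0\!\left(2e^{-R/2}\right) \tag{27}$$

where the function $K_0(z)$ is defined by

$$K_0(z) = -(\gamma - \lg 2 + \lg z)\sum_{\nu=0}^{\infty}\frac{1}{\nu!\,\nu!}\left(\frac{z}{2}\right)^{2\nu} + \sum_{\nu=1}^{\infty}\frac{1}{\nu!\,\nu!}\left(\frac{z}{2}\right)^{2\nu}\left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{\nu}\right). \tag{28}$$

Finally the asymptotic distribution $\psi(R)$ of the reduced range may be written
from (27) and (28)

$$\psi(R) = \sum_{\nu=0}^{\infty}\frac{e^{-(\nu+1)R}}{\nu!\,\nu!}\left(R - 2\gamma + 2S_\nu\right), \qquad S_\nu = 1 + \frac{1}{2} + \cdots + \frac{1}{\nu},\quad S_0 = 0. \tag{28a}$$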
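The two series (25a) and (28a) lend themselves to direct computation for moderate and large reduced ranges. The following is a minimal sketch, assuming the reading of the series in which $S_\nu$ is the $\nu$th partial sum of the harmonic series; the function names and the number of terms are illustrative, and for large negative $R$ the alternating terms become large, so the sketch is meant for $R$ of about $-3$ and above.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, the gamma of the text

def range_cdf_series(R, terms=60):
    """Psi(R) from (25a): 1 - Psi = sum e^(-vR)/((v-1)! v!) (R - 2g + 2 S_v - 1/v)."""
    total, harmonic = 0.0, 0.0
    for v in range(1, terms + 1):
        harmonic += 1.0 / v
        coeff = math.exp(-v * R) / (math.factorial(v - 1) * math.factorial(v))
        total += coeff * (R - 2.0 * EULER_GAMMA + 2.0 * harmonic - 1.0 / v)
    return 1.0 - total

def range_pdf_series(R, terms=60):
    """psi(R) from (28a): sum e^(-(v+1)R)/(v! v!) (R - 2g + 2 S_v), with S_0 = 0."""
    total, harmonic = 0.0, 0.0
    for v in range(terms + 1):
        if v > 0:
            harmonic += 1.0 / v
        coeff = math.exp(-(v + 1) * R) / math.factorial(v) ** 2
        total += coeff * (R - 2.0 * EULER_GAMMA + 2.0 * harmonic)
    return total

if __name__ == "__main__":
    for R in (0.5, 1.0, 3.0, 10.0):
        print(R, round(range_cdf_series(R), 5), round(range_pdf_series(R), 5))
```

The printed values can be set against the entries of Table I for the same reduced ranges.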


We first investigate the analytic behavior and the order of magnitude of the
probability $\Psi(R)$ and the distribution $\psi(R)$ for large negative, and large positive
values of the reduced range, i.e. for large and small values of the positive variable
$z$. If $z$ is so large that

$$z^{-3} = \frac{e^{3R/2}}{8} \ll 1 \tag{29}$$
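the expressions for $K_1(z)$ and $K_0(z)$ become [14], p. 271,

$$K_1(z) = \sqrt{\frac{\pi}{2z}}\,e^{-z}\left(1 + \frac{3}{8z} - \frac{15}{128z^{2}} + \cdots\right),$$

$$K_0(z) = \sqrt{\frac{\pi}{2z}}\,e^{-z}\left(1 - \frac{1}{8z} + \frac{9}{128z^{2}} - \cdots\right).$$

The probability $\Psi(R)$ becomes, from (24) and (19),

$$\Psi(R) \sim \sqrt{\pi}\,\exp\left[-\frac{R}{4} - 2e^{-R/2}\right]\left(1 + \frac{3}{16}\,e^{R/2} - \frac{15}{512}\,e^{R}\right). \tag{25'}$$

The condition (29) holds, say, for $R = -4$. The numerical calculation leads,
for $\Psi(-4)$, to the order of magnitude $10^{-6}$.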


In the same way, the distribution $\psi(R)$ becomes, from (26) and (19), for large
negative reduced ranges

$$\psi(R) \sim \sqrt{\pi}\,\exp\left[-\frac{3R}{4} - 2e^{-R/2}\right]\left(1 - \frac{1}{16}\,e^{R/2} + \frac{9}{512}\,e^{R}\right). \tag{27'}$$

This expression cannot be obtained from (25') since the approximations for
$K_0(z)$ and $K_1(z)$ used do not fulfill the relations between the derivatives given
above. The order of magnitude of $\psi(-4)$ is $10^{-5}$.

Thus the probability $\Psi(R)$ and the distribution $\psi(R)$ may be neglected for
$R \le -4$. This removes the importance of the lower limit $R = -2\alpha u$ stated
in (10'). If $\alpha u \ge 2$, the distribution of the range may be dealt with as if it
were practically unlimited toward the left.
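The orders of magnitude quoted for the left tail can be reproduced from the approximations (25') and (27'). The following is a minimal sketch, assuming the three-term forms that follow from the standard large-argument expansions of $K_1$ and $K_0$ with $z = 2e^{-R/2}$; the function names are illustrative.

```python
import math

def cdf_left_tail(R):
    """Approximation (25') to Psi(R) for large negative reduced ranges."""
    s = math.exp(R / 2.0)  # s = e^(R/2), so that z = 2 / s
    factor = 1.0 + 3.0 * s / 16.0 - 15.0 * s * s / 512.0
    return math.sqrt(math.pi) * math.exp(-R / 4.0 - 2.0 / s) * factor

def pdf_left_tail(R):
    """Approximation (27') to psi(R) for large negative reduced ranges."""
    s = math.exp(R / 2.0)
    factor = 1.0 - s / 16.0 + 9.0 * s * s / 512.0
    return math.sqrt(math.pi) * math.exp(-3.0 * R / 4.0 - 2.0 / s) * factor

if __name__ == "__main__":
    for R in (-4.0, -3.0):
        print(R, cdf_left_tail(R), pdf_left_tail(R))
```

At $R = -3$ the two approximations can be compared with the first line of Table I; at $R = -4$ they give the orders of magnitude referred to in the text.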
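For large positive reduced ranges to which correspond small values of $z$, say

$$z^{3} = 8e^{-3R/2} \ll 1, \tag{29'}$$

the Bessel functions $K_1(z)$ and $K_0(z)$ become, from (23) and (28),

$$K_1(z) = \left(\gamma + \lg\frac{z}{2}\right)\left(\frac{z}{2} + \frac{z^{3}}{16}\right) + \frac{1}{z} - \left(\frac{z}{4} + \frac{5z^{3}}{64}\right), \tag{23'}$$

$$K_0(z) = -\left(\gamma + \lg\frac{z}{2}\right)\left(1 + \frac{z^{2}}{4}\right) + \frac{z^{2}}{4}. \tag{28'}$$

In this case we are interested to know how far the probability $\Psi(R)$ differs from
unity. Consequently we calculate $1 - \Psi(R)$ and obtain, from (24) and (23'),

$$1 - \Psi(R) = -\left(\gamma + \lg\frac{z}{2}\right)\left(\frac{z^{2}}{2} + \frac{z^{4}}{16}\right) + \frac{z^{2}}{4} + \frac{5z^{4}}{64}.$$

The right side becomes, from (19),

$$2e^{-R}\left[\left(\frac{R}{2} - \gamma\right)\left(1 + \frac{e^{-R}}{2}\right) + \frac{1}{2} + \frac{5}{8}\,e^{-R}\right] = e^{-R}\left[(R - 2\gamma)\left(1 + \frac{e^{-R}}{2}\right) + 1 + \frac{5}{4}\,e^{-R}\right]$$

or

$$1 - \Psi(R) = e^{-R}(R - 2\gamma + 1) + \frac{e^{-2R}}{2}\left(R - 2\gamma + \frac{5}{2}\right).$$

If $R$ is so large that

$$e^{-R} \ll 1$$

we simply have

$$1 - \Psi(R) = e^{-R}(R - 2\gamma + 1). \tag{25''}$$

For example, for $R = 10$, the preceding condition is satisfied and $1 - \Psi(R)$ is
of the order $5 \cdot 10^{-4}$.

In the same manner we calculate the density of probability $\psi(R)$ for large
reduced ranges. From (26), (19) and (28') we obtain

$$\psi(R) = 2e^{-R}\left[\left(\frac{R}{2} - \gamma\right)\left(1 + e^{-R}\right) + e^{-R}\right].$$

By neglecting terms of order $e^{-3R}$, the right side becomes

$$e^{-R}\left[(R - 2\gamma)(1 + e^{-R}) + 2e^{-R}\right] = e^{-R}\left[R - 2\gamma + e^{-R}(R - 2\gamma + 2)\right]$$

whence

$$\psi(R) = e^{-R}(R - 2\gamma)(1 + e^{-R}) + 2e^{-2R}.$$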













In first approximation we obtain

$$\psi(R) = e^{-R}(R - 2\gamma), \tag{27''}$$

a formula which may also be derived directly from (25''). The density of
probability is of the order $10^{-4}$ for $R = 10$.


From the formulae (25') and (27') valid for large negative values of $R$, and
from the formulae (25'') and (27'') valid for large positive values of $R$ follow the
boundary conditions

$$\lim_{R \to -\infty}\frac{\psi(R)}{\Psi(R)} = e^{-R/2}; \qquad \lim_{R \to \infty}\frac{\psi(R)}{1 - \Psi(R)} = \frac{R - 2\gamma}{R - 2\gamma + 1}.$$


For the construction of tables of the distribution $\psi(R)$ and the probability
$\Psi(R)$ of the reduced range it is sufficient to consider the interval

$$-3 < R < 10.$$

The two functions $K_1(z)$ and $K_0(z)$ have been tabulated [14] and [19]. Hence
the probability and the distribution could be calculated from such tables of the
Bessel functions. This procedure, however, was only used to obtain boundary
values. The tables I and Ia are based on computations made in the Calculation
and Ballistics Department at the Naval Proving Ground Dahlgren by stepwise
integration of the differential equation (17) using the special Relay Calculator
of the International Business Machines Corporation.³
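A stepwise integration of (17) of the kind mentioned here can be imitated on a small scale. The following is a minimal sketch and not a reconstruction of the Naval Proving Ground computation: it assumes boundary values at $R = -4$ taken from the left-tail approximations (25') and (27'), and it uses a classical fourth-order Runge-Kutta step, which is a swapped-in modern technique; the function names and the step length are illustrative.

```python
import math

def left_tail_start(R):
    """Boundary values (Psi, psi) from the approximations (25') and (27')."""
    s = math.exp(R / 2.0)
    big = math.sqrt(math.pi) * math.exp(-R / 4.0 - 2.0 / s) * (1.0 + 3.0 * s / 16.0 - 15.0 * s * s / 512.0)
    small = math.sqrt(math.pi) * math.exp(-3.0 * R / 4.0 - 2.0 / s) * (1.0 - s / 16.0 + 9.0 * s * s / 512.0)
    return big, small

def integrate_range_ode(R_start=-4.0, R_end=0.0, h=0.005):
    """Runge-Kutta integration of Psi' = psi, psi' = -psi + e^(-R) Psi, i.e. (17)."""
    def deriv(R, P, p):
        return p, -p + math.exp(-R) * P

    P, p = left_tail_start(R_start)
    R = R_start
    for _ in range(round((R_end - R_start) / h)):
        k1 = deriv(R, P, p)
        k2 = deriv(R + h / 2, P + h / 2 * k1[0], p + h / 2 * k1[1])
        k3 = deriv(R + h / 2, P + h / 2 * k2[0], p + h / 2 * k2[1])
        k4 = deriv(R + h, P + h * k3[0], p + h * k3[1])
        P += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        p += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        R += h
    return P, p

if __name__ == "__main__":
    print(integrate_range_ode(-4.0, 0.0), integrate_range_ode(-4.0, 1.0))
```

Integrating in the direction of increasing $R$ is the convenient one here, since the wanted solution $zK_1(z)$ grows relative to the second solution of (17) as $z$ decreases.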

Table I gives the probability $\Psi(R)$ (col. 2) and the distribution $\psi(R)$ (col. 4)
for the reduced ranges $-3 \le R \le 10.5$ in intervals $\Delta R = 0.5$. The differences
$\Delta\Psi$ given in col. 3 are taken from the original figures.

For different uses it is necessary to know the reduced range as a function of
its probability. This relation is shown in Table Ia. The first column gives the
probability, the first line gives the last decimal of this probability, and the cells
give the reduced range corresponding to the probability obtained from the
combination of the first column and the first line. For example: The reduced
range $R = -3.20$ corresponds to the probability $\Psi(R) = 0.0002$, and the reduced
range $R = 10.44$ corresponds to the probability $\Psi(R) = 0.9997$.

This table may be used for obtaining the percentage points of the reduced
range. The mode $\tilde{R}$, the median $\breve{R}$ calculated by the Naval Proving Ground,
and the mean $\bar{R}$ obtained from (14) and (10) are

$$\tilde{R} = 0.506366440; \qquad \breve{R} = 0.928597642; \qquad \bar{R} = 1.154431330. \tag{30}$$
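The three characteristic values in (30) can be recomputed from the series for the probability and the distribution together with the mode condition (18). The following is a minimal sketch, assuming the series (25a) and (28a) with harmonic partial sums and a plain bisection; the function names, the bracketing interval and the tolerance are illustrative choices.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def cdf(R, terms=40):
    """Psi(R) from the series (25a)."""
    total, harmonic = 0.0, 0.0
    for v in range(1, terms + 1):
        harmonic += 1.0 / v
        coeff = math.exp(-v * R) / (math.factorial(v - 1) * math.factorial(v))
        total += coeff * (R - 2.0 * EULER_GAMMA + 2.0 * harmonic - 1.0 / v)
    return 1.0 - total

def pdf(R, terms=40):
    """psi(R) from the series (28a)."""
    total, harmonic = 0.0, 0.0
    for v in range(terms + 1):
        if v > 0:
            harmonic += 1.0 / v
        total += math.exp(-(v + 1) * R) / math.factorial(v) ** 2 * (R - 2.0 * EULER_GAMMA + 2.0 * harmonic)
    return total

def bisect(f, a, b, tol=1e-10):
    """Root of f in [a, b], assuming f changes sign exactly once there."""
    fa = f(a)
    while b - a > tol:
        m = 0.5 * (a + b)
        fm = f(m)
        if (fa < 0.0) == (fm < 0.0):
            a, fa = m, fm
        else:
            b = m
    return 0.5 * (a + b)

mode = bisect(lambda R: math.exp(-R) * cdf(R) - pdf(R), 0.0, 2.0)  # condition (18)
median = bisect(lambda R: cdf(R) - 0.5, 0.0, 2.0)
mean = 2.0 * EULER_GAMMA

if __name__ == "__main__":
    print(mode, median, mean)
```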


A probability paper for the range may be constructed in the following way: The 
observed ranges w are plotted on the vertical axis; the reduced ranges R on a 
horizontal axis. The abscissa shows the probabilities 


$$\Psi(R) = G(w)$$














³ The author wishes to express his sincere appreciation for the permission to use these
computations. The original tables give the probability and the distribution to 8 significant
decimal places at intervals $\Delta R = 1/100$. Lack of space prevents the reproduction of these
tables.



























TABLE I
Asymptotic Probability and Asymptotic Distribution of the Reduced Range

      1              2              3              4
Reduced Range   Probability    Difference    Distribution
      R           $\Psi(R)$      $\Delta\Psi$      $\psi(R)$

    -3.0          .00050                        .00212
    -2.5          .00324                        .01057
    -2.0          .01356                        .03386
    -1.5          .04048
    -1.0          .09299
    -0.5          .17440
     0.0          .27973
     0.5          .39794                        .24075
     1.0          .51654                        .23021
     1.5          .62545                        .20346
     2.0          .71872                        .16898
     2.5          .79429                        .13360
     3.0          .85289                        .10157
     3.5          .89675                        .07483
     4.0          .92867                        .05375
     4.5          .95136                        .03783
     5.0          .96721                        .02618
     5.5          .97810                        .01787
     6.0          .98549                        .01205
     6.5          .99045                        .00805
                                 .00330
     7.0          .99375                        .00534
                                 .00218
     7.5          .99594                        .00351
                                 .00143
     8.0          .99737                        .00230
                                 .00093
     8.5          .99830                        .00150
                                 .00061
     9.0          .99891                        .00097
                                 .00039
     9.5          .99930                        .00062
                                 .00025
    10.0          .99955                        .00040
                                 .00016
    10.5          .99972                        .00026














corresponding to the reduced ranges $R$. If the observations follow the theory,
the observed ranges are scattered around the straight line

$$w = 2u + \frac{R}{\alpha}. \tag{10'}$$

If the samples are drawn simultaneously, and if there is a constant interval of
time between the drawings, this interval may be used as unit of time for the
construction of the return periods $T(R)$ and ${}_1T(R)$ of a range equal to, or larger
than (smaller than) $R$ where

$$T(R) = \frac{1}{1 - \Psi(R)}; \qquad {}_1T(R) = \frac{1}{\Psi(R)}.$$

The first (second) notion applies to the range above (below) the median. The
return periods are shown in an upper parallel to the abscissa.

A scheme for this paper is given in Fig. 3. Such a paper will allow a graphical
test for the fit of the observed ranges to our theory, and avoids any numerical
calculations. Obviously this method may only be used if the initial distribution
is symmetrical, unlimited, and of the exponential type, and if the sample size
is so large that the asymptotic distribution holds.
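The construction of the probability paper amounts to three small conversions. The following is a minimal sketch, assuming that the return periods are the reciprocals of the probabilities of exceeding and of falling below $R$, which is the usual definition; the function names are illustrative, and the parameter values used in the usage lines are those of the numerical example given later in the paper.

```python
def observed_range(R, two_u, inv_alpha):
    """Straight line (10'): w = 2u + R / alpha."""
    return two_u + inv_alpha * R

def return_period_above(prob):
    """Return period T(R) of a range equal to or larger than R (assumed 1 / (1 - Psi))."""
    return 1.0 / (1.0 - prob)

def return_period_below(prob):
    """Return period of a range smaller than R (assumed 1 / Psi)."""
    return 1.0 / prob

# A few (R, Psi(R)) pairs copied from Table I.
TABLE_I = {0.0: 0.27973, 1.0: 0.51654, 5.0: 0.96721, 10.0: 0.99955}

if __name__ == "__main__":
    for R, prob in TABLE_I.items():
        w = observed_range(R, two_u=8.82, inv_alpha=1.61)
        print(R, w, round(return_period_above(prob), 1))
```

Each printed line gives a point $(\Psi, w)$ of the theoretical straight line on the paper and the return period to be marked on the upper scale.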







6. The range, the midrange, and the extremes. The asymptotic distribution (27)
of the reduced range was obtained by convolution of the asymptotic
distributions (5) of the extremes. The same method leads to the asymptotic
distribution of the reduced midrange [8]

$$v = \alpha(x_1 + x_n). \tag{31}$$


TABLE Ia
The Reduced Range $R$ as Function of Its Probability $\Psi(R)$


eiais .] } 5 | | | 3 | 
| * |—8.20)—-3.12/-3. .00|—2.96—2.92|—2 
|—2.83] —2.64|—2.52|—2.43/—2.36|—2.30|—2.25|—2 


| 
.12|—1.84|—1.65) .ol| 09 .28)—1. 


ie | Ss  saeucelinasiameameiias 
.88|—0.81|—0.75| —0.69| —0.63|—0.58|—0. 
.32|—0.27|—0.22|—0.18 —0.13,—0.09|—0. 
.13} 0.17) 0.22) 0.26) 0.30) 0.34) 0.: 
.55| 0.59} 0.63) 0.68 0.72) 0.76) 0. 














97, 1.02) 1.06) 1.10} 1.15) 1.19 
43} 1.47) 1. gr .67 
.95| 2.01) 2.07; 2.13) 2. .26 


.62| 2.70} 2.79} 2.88} 2.97} 3.07 





.85| 4. .23| 4. 75 
3.45, 6.57) 6.71] 6.87) 7.05 7. 52 
9.10} 9.22} 9.35] 9.50) 9.67 














* These values have not been calculated. 


On the other hand, the asymptotic distributions of the reduced extremes are
obtained by introducing the transformations

$$y_1 = \alpha(x_1 + u); \qquad y_n = \alpha(x_n - u) \tag{32}$$

into formulas (5). It is interesting to compare these four distributions and four
probabilities with each other. This is done in Figures 1 and 2. The probability
and the distribution of the midrange are practically identical with the probability
and distribution of the smallest value, for small values of the midrange, and
become practically identical with the probability and distribution of the largest
value for large values of the midrange. Fig. 2 shows that the asymptotic
distribution of the reduced range is less asymmetrical than the asymptotic
distributions of the reduced extremes.





Fig. 1. Probabilities of the reduced smallest value, largest value, range, and midrange, plotted against the reduced variate.


Table II contains some characteristic values for these four asymptotic
distributions. The first three columns are obtained from previous publications
[6, 8]. The mean range is equal to the range of the means for the extremes.
The median of the range is larger than the range from the median of the largest
to the median of the smallest value. The mode of the range is slightly smaller
than the mean of the largest value. These statements hold, of course, only for
the reduced variates.


Fig. 2. Distributions of the reduced smallest value $y_1$, largest value $y_n$, range $R = y_n - y_1$, and midrange $v = y_1 + y_n$, with their expectations marked, plotted against the reduced variate.


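From the mode $\tilde{R}$ of the reduced range given in equation (30) and the
transformation (10), the mode $\tilde{w}$ of the range itself is obtained as

$$\tilde{w} = 2u + \frac{\tilde{R}}{\alpha}$$

whereas the difference of the modes of the largest and of the smallest values is

$$\tilde{x}_n - \tilde{x}_1 = 2u.$$

Consequently

$$\tilde{w} = \tilde{x}_n - \tilde{x}_1 + \frac{\tilde{R}}{\alpha}. \tag{33}$$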





Fig. 3. Scheme of a probability paper for the range, with the probability on the abscissa and the return period $T(R)$ on an upper parallel scale.


For a symmetrical initial distribution of the exponential type the mode of the
range converges toward the range of the modes of the smallest and of the largest
value, provided that the parameter $\alpha$ increases without limit with the sample
size. Thus this convergence does not hold for all symmetrical distributions.
The last two lines in Table II give the four probabilities corresponding to the
intervals from the mean $\mu$ minus once (twice) the standard deviation $\sigma$, up to
the mean plus once (twice) the standard deviation. The first probability for


TABLE II 
Characteristics for the 4 Asymptotic Reduced Distributions 





1 2 3 | + 5 
Characteristic Largest Value | Smallest Value Midrange Range 





Mode 0 | 0 .506 





Expectation y = .57722 | = —.57722| Qy = 1.15444 . 





| | 
Median —Iglg2 = .36651 | = —.36651| 


Seminvariant char. r(1 — t) r(1 + 2#) 
function 








aa 
— = 1.6449 
6 64493 


Variance 








First + second mo- A, = 
ment quotient Be 


95% Probability 


99% Probability 


F(u +o) — F(u — a) 











F(u + 2c) — F(u — 2c) § .90 


the four distributions is about the same as for the normal distribution. The 
second probability for the range and the midrange is about the same as for 
the normal one. 


7. The asymptotic distribution of the range for a symmetrical variate.
The asymptotic distribution of the reduced range $R$ is, of course, independent of
the sample size, and parameter-free. Neither statement holds for the
distribution $g(w)$ of the range proper which is, from (11),

$$g(w) = \alpha\,\psi[\alpha(w - 2u)]. \tag{34}$$

In this formula, the range is expressed in the same units as the initial variate.
The parameters $\alpha$ and $u$ are functions of the sample size $n$, the function depending
upon the initial distribution. From equations (6), (8), (14) it follows that an
increase of the sample size has two influences on the distribution of the range.
The increase of the parameter $u$ shifts the distribution toward the right without
changing its form, whereas the parameter $\alpha$ influences the shape of the
distribution. If $\alpha$ increases (decreases) with $n$, the distribution of the range shrinks
(spreads) with increasing sample size. If $\alpha$ is independent of $n$, an increase of
the sample size does not change the shape of the distribution. Only in the first
case may we increase the precision of the range by increasing the sample size.
The two parameters thus influence the range in the same way as they influence
the extreme values.

To use equation (34) for a given initial distribution and a given sample size,
we have to determine the expected largest value $u$ and the parameter $\alpha$ as
functions of $n$. We may use the definitions (6), (7), (8) if the initial distribution is
known and of the exponential type, and if the sample size is so large that the
most probable largest value is sufficiently near to the solution of (7).

As a first example, consider the so-called logistic distribution. This
probability is

$$\Phi(x) = (1 + e^{-x})^{-1}. \tag{35}$$

The initial distribution is

$$\varphi(x) = \Phi(x)\,(1 - \Phi(x)) \tag{35'}$$

and the derivative is

$$\varphi'(x) = \Phi(x)\,(1 - \Phi(x))\,(1 - 2\Phi(x)). \tag{35''}$$

Equation (6) becomes

$$1 + e^{-u} = \frac{n}{n - 1}$$

whence the expected largest value

$$u = \lg(n - 1). \tag{36}$$

The most probable largest value $\tilde{x}_n$ for $n$ observations is obtained from (7).
This equation becomes, from equation (35),

$$(n - 1)\,(1 - \Phi(\tilde{x}_n)) = -1 + 2\Phi(\tilde{x}_n)$$

whence $\Phi(\tilde{x}_n) = \dfrac{n}{n + 1}$.
Equation (35) leads to the most probable largest value

$$\tilde{x}_n = \lg n. \tag{36'}$$

Even for $n$ as small as 30 the difference between $\tilde{x}_n$ and $u$ is less than 1%.
Consequently the asymptotic form of the distribution of the range may be used even
for small samples. The two parameters are

$$u = \lg n; \qquad \alpha = \frac{n}{n + 1}. \tag{37}$$
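The closed forms obtained for the logistic distribution are easy to verify numerically. The following is a minimal sketch, assuming that $\lg$ denotes the natural logarithm and that (6) defines $u$ by $\Phi(u) = 1 - 1/n$; the function names are illustrative.

```python
import math

def logistic_cdf(x):
    """Logistic probability (35)."""
    return 1.0 / (1.0 + math.exp(-x))

def expected_largest(n):
    """u from (36): the solution of Phi(u) = 1 - 1/n."""
    return math.log(n - 1)

def modal_largest(n):
    """Most probable largest value from (36')."""
    return math.log(n)

def alpha(n):
    """Parameter alpha of (37), equal to Phi at the modal largest value."""
    return logistic_cdf(modal_largest(n))

if __name__ == "__main__":
    n = 30
    rel = (modal_largest(n) - expected_largest(n)) / modal_largest(n)
    print(expected_largest(n), modal_largest(n), rel, alpha(n))
```

For $n = 30$ the relative difference printed is just under one per cent, in agreement with the remark in the text.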










Since $\alpha$ converges toward unity, an increase of the sample size shifts the
distribution of the range toward the right without influencing its shape: the precision of
any estimate made from the range cannot be increased by increasing the sample
size.

The characteristic ranges introduced in paragraph 5 are obtained immediately:
the mean $\bar{w}$, the mode $\tilde{w}$, the median range $\breve{w}$ and the ranges $w_{.95}$ and $w_{.99}$

$$\bar{w} = 2\lg n + 1.154; \qquad \tilde{w} = 2\lg n + .506;$$
$$\breve{w} = 2\lg n + .929; \qquad w_{.95} = 2\lg n + 4.46; \qquad w_{.99} = 2\lg n + 6.45$$

are parallel straight lines if traced as functions of the sample size $n$ on
semi-logarithmic paper.

For the normal distribution we cannot expect such simple results. Here, $u$
and $\alpha$ can only be calculated as numerical functions of $n$ although limiting forms
of these functions are known. The parameter $\alpha$ increases with $n$, and the
standard error of the range decreases without limit although very slowly. The
logistic distribution belongs to the first, the normal distribution to the second
class of initial distributions of the exponential type.

The probabilities and the distributions of the range for normal samples of
size 5, 10, and 20 as calculated by E. S. Pearson and H. O. Hartley [16] are
traced in Figures 4 and 5. Our aim is to trace the corresponding asymptotic
probabilities and distributions in order to see how far the asymptotic ranges
differ from the exact ones. However, we have first to settle the preliminary
question how far the most probable largest value $\tilde{x}_n$ differs from the expected
largest value $u$. The most probable largest value $\tilde{x}_n$ is obtained from (7) which
becomes, for the normal distribution,

$$\tilde{x}_n\,\Phi(\tilde{x}_n) = (n - 1)\,\varphi(\tilde{x}_n). \tag{38}$$
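Equation (38) has no closed solution but is readily solved numerically. The following is a minimal sketch, assuming the standard normal distribution as initial distribution and $u$ defined by $\Phi(u) = 1 - 1/n$; it uses `statistics.NormalDist` from the Python standard library and a plain bisection, and the function names are illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal initial distribution

def expected_largest(n):
    """u defined by Phi(u) = 1 - 1/n."""
    return STD_NORMAL.inv_cdf(1.0 - 1.0 / n)

def modal_largest(n, lo=0.0, hi=10.0, tol=1e-12):
    """Solve x Phi(x) = (n - 1) phi(x), equation (38), by bisection (n >= 2)."""
    def f(x):
        return x * STD_NORMAL.cdf(x) - (n - 1) * STD_NORMAL.pdf(x)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for n in (5, 10, 20, 50, 100):
        print(n, round(modal_largest(n), 3), round(expected_largest(n), 3))
```

The two printed columns may be compared with the modal and expected largest values listed in Table III.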


The results $\tilde{x}_n$ as functions of $n$ are shown in Table III cols. 1 and 2. The
expected values $u$ obtained from (6) are given in col. 3. For small samples, the
two values $\tilde{x}_n$ and $u$ differ widely, as might be expected. We are inclined to
conclude that the asymptotic distribution of the range cannot hold for small
samples. However, the only legitimate conclusion to be drawn is that we
cannot calculate the two parameters in the way stated before, (6) and (8). Instead,
we estimate them directly from the observations. The question of the most
efficient estimates of these parameters is not yet solved. The simplest way is to
use the mean range $\bar{w}_n$ and the standard deviation of the range $\sigma_{w,n}$, as given by
Tippett [20] and Pearson [15]. To distinguish these estimates from the asymptotic
values, we write the estimates with an index $n$. From (14) we obtain

$$\frac{1}{\alpha_n} = \frac{\sqrt{3}}{\pi}\,\sigma_{w,n}; \qquad 2u_n = \bar{w}_n - \frac{2\gamma}{\alpha_n}. \tag{39}$$
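The estimates (39) are a two-line computation. The following is a minimal sketch; the function name is illustrative, and the two sets of input figures are taken from the paper itself (the line of Table III for $n = 5$, and the mean 10.68 and standard deviation 2.93 of the observed ranges used in the numerical example of this section).

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def estimate_parameters(mean_range, sd_range):
    """Return (1/alpha_n, 2 u_n) from the mean and standard deviation of the range, cf. (39)."""
    inv_alpha = math.sqrt(3.0) * sd_range / math.pi
    two_u = mean_range - 2.0 * EULER_GAMMA * inv_alpha
    return inv_alpha, two_u

if __name__ == "__main__":
    print(estimate_parameters(2.326, 0.8641))  # normal samples of size 5
    print(estimate_parameters(10.68, 2.93))    # observed ranges of the example
```

With the estimates in hand, the theoretical range for any reduced range $R$ follows from $w = 2u_n + R/\alpha_n$.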


Qn T Qn 


Table III gives the calculated means $\bar{w}_n$ and standard deviations $\sigma_{w,n}$ of the
range, and the estimates $1/\alpha_n$ and $2u_n$. Fig. 6 shows how the most probable
largest values $\tilde{x}_n$ approach the expected largest value $u$ with increasing sample
size. The estimate $u_n$ quickly approaches $u$. Besides we trace the mean range
$\bar{w}_n$, the standard error of the range $\sigma_{w,n}$, and $1/\alpha_n$ which is proportional to it.


Fig. 4. Exact and limiting probabilities for the range in normal samples of size $n = 5, 10, 20, 50, 100$; abscissa: range, ordinate: probability.


From col. 8 follows that the condition $\alpha u \ge 2$ is fulfilled from $n = 6$ onward.
The ranges obtained from the transformations

$$w = 2u_n + \frac{R}{\alpha_n} \tag{40}$$

are given in Table IV, cols. 3-7. The asymptotic probabilities of the range as
obtained from the combination of columns 3-7, and col. 2 of Table IV are traced
in Fig. 4 as separated points. The asymptotic probabilities are situated very
near to the exact ones. Therefore the same method was used to calculate the
asymptotic probabilities of the range for $n = 50$ and $n = 100$ which have not
been calculated by Pearson. They too are traced in Fig. 4.


Fig. 5. Exact and asymptotic distribution of the normal range for samples of size $n = 5, 10, 20, 50, 100$.


The asymptotic probabilities of the range hold even for small normal samples. 
However, the parameters obtained from the exact distribution differ considerably 
from their asymptotic values. In other words: The asymptotic probabilities of the 
range hold even for small normal samples provided that the parameters are taken 
from the observations. 

To compare the asymptotic distributions of the normal range to the calculated 
distributions, we attribute the asymptotic differences AV/a, for a unit interval 
Aw = 1 to the middle of the corresponding intervals. The results are traced in 
Fig. 5 for n = 5, 10, 20, 50, 100. On the other hand, we take the differences 







Av for unit intervals from Pearson’s tables, and trace them in the same graph. 
The fit of the calculated to the asymptotic values may be considered satisfactory, 
TABLE III
Estimate of Parameters from the Calculated Distributions of the Normal Range

    1         2          3           4            5           6          7
 Sample     Largest Value         Mean Range   Standard    Estimated parameters
 size n    Modal    Expected                   deviation       of the range
           $\tilde{x}_n$      $u$        $\bar{w}_n$      $\sigma_{w,n}$     $1/\alpha_n$     $2u_n$

    3      .765      .431       1.693        .8884       .4898      1.128
    4      .938      .674       2.059        .8798       .4851      1.499
    5     1.061      .842       2.326        .8641       .4764      1.776
   10     1.419     1.282       3.078        .797        .4389      2.571
   20     1.740     1.645       3.735        .729        .402       3.271
   50     2.126     2.054       4.498        .653        .360       4.082
  100     2.377     2.326       5.015        .605        .334       4.630




















TABLE IV 
Asymptotic Probabilities for Normal Ranges Taken from Small Samples 








1 2 3 4 5 | é 





Probability Normal ranges w = 2un + R/ap for sample sizes 


G(w) = ¥(R) 





n=5 n= 10 | = 50 |_@ = 100 


.000 135 125 | 
.014 82 69 | 
.093 .30 13 | 
.280 .78 57 | 
.517 52 01 


1 -00 
1 
2 
2 
3 
.719 13 3.45 
3 
4 
4 
5 
5 


rnwon 


IWonrwoawoa hd 


853 21 .89 
.929 .68 .33 
. 967 £16 U7 
.985 .63 .20 
.994 11 64 














or Fr WWN NS 
Oooowk, & Ke WWD DD bd 
OaOaQMnooe & PW 
OArWwWwoqaw 





Fig. 5 shows furthermore how the distributions of the range are shifted toward 
the right and become more concentrated for increasing sample sizes. 

Fig. 6. The parameters of the range for small normal samples and measures of dispersion: modal largest value, estimate of $u$, mean range, and standard deviation, plotted against the sample size $n$.

As an example for the practical application of the asymptotic distribution of
the range, we use an observed distribution of 50 ranges taken from samples of
$n = 14$ normal values given in Freeman's book [5] p. 128. The observed step
function is traced in Fig. 7. For reasons given in a previous article [7] we
attribute the cumulative frequency .5 to the smallest range 3, and the cumulative
frequency 49.5 to the largest range 18. To compare this step function with the
probability $G(w)$, we estimate the two parameters $u_n$ and $\alpha_n$ from formula (39).
The mean range $\bar{w}_n$ and the estimate $s_{w,n}$ of the standard deviation of the ranges
are

$$\bar{w}_n = 10.68; \qquad s_{w,n} = 2.93.$$

Consequently we obtain, from (39),

$$\frac{1}{\alpha_n} = 1.61; \qquad 2u_n = 8.82.$$

The theoretical ranges are thus, from (40),

$$w = 8.82 + 1.61\,R.$$

The corresponding probabilities $G(w)$ taken from Table I are traced in Fig. 7.
The fit of the theory to the observations is certainly satisfactory, especially if
we take into account that the ranges are given in integer numbers only.

Fig. 7. Step function of the observed ranges and asymptotic probability of the range; axes: probability and ranges.















8. The mth range and the asymmetrical case. An obvious generalization
of the theory as established in paragraph 4 consists in the construction of the
asymptotic distribution of the $m$th range for an unlimited symmetrical
distribution of the exponential type. The $m$th range is the positive distance from the
$m$th observation from above, $x_m$, to the $m$th observation from below, ${}_mx$. We
suppose $m$ to be very small compared to the sample size. Under the conditions
stated in the beginning, the joint distribution $\mathfrak{w}_n({}_mx, x_m)$ of the $m$th extreme
values splits into the product of the asymptotic distribution of the $m$th extreme
value from above, $f_m(x_m)$, by the asymptotic distribution of the $m$th extreme
value from below, ${}_mf({}_mx)$. Here, [6]

$$f_m(x_m) = \frac{m^{m}}{(m-1)!}\,\alpha_m \exp\left[-m\alpha_m(x_m - u_m) - m\,e^{-\alpha_m(x_m - u_m)}\right]$$

$${}_mf({}_mx) = \frac{m^{m}}{(m-1)!}\,\alpha_m \exp\left[m\alpha_m({}_mx + u_m) - m\,e^{\alpha_m({}_mx + u_m)}\right].$$

The sample size must be so large that the most probable $m$th extreme value $\tilde{x}_m$
is sufficiently near to $u_m$ which is defined as the solution of

$$\Phi(u_m) = 1 - \frac{m}{n}.$$

The factor $\alpha_m$ defined by

$$\alpha_m = \frac{\varphi(u_m)}{1 - \Phi(u_m)}$$

is related to the asymptotic standard error $\sigma_m$ of the $m$th extreme value by

$$\alpha_m^{2}\,\sigma_m^{2} = \sum_{\nu=m}^{\infty}\frac{1}{\nu^{2}}.$$

The joint asymptotic distribution $\mathfrak{w}({}_mx, w_m)$ of the $m$th smallest value and the
$m$th range

$$w_m = x_m - {}_mx \tag{41}$$

is

$$\mathfrak{w}({}_mx, w_m) = \frac{m^{2m}}{[(m-1)!]^{2}}\,\alpha_m^{2} \exp\left[-m\alpha_m(w_m - 2u_m) - m\,e^{\alpha_m({}_mx + u_m)} - m\,e^{-\alpha_m({}_mx + w_m - u_m)}\right].$$

The asymptotic distribution $g(w_m)$ of the $m$th range is, dropping the index $m$ of
the variable ${}_mx$,

$$g(w_m) = \frac{m^{2m}}{[(m-1)!]^{2}}\,\alpha_m^{2}\,e^{-m\alpha_m(w_m - 2u_m)}\int_{-\infty}^{+\infty}\exp\left[-m\,e^{\alpha_m(x + u_m)} - m\,e^{-\alpha_m(x + w_m - u_m)}\right] dx.$$

Again we introduce a reduced range $R_m$ defined by

$$\alpha_m(w_m - 2u_m) = R_m \ge -2\alpha_m u_m \tag{42}$$

and put for the integration

$$\alpha_m(x + u_m) = y.$$

Then the asymptotic distribution $\psi(R_m)$ of the reduced $m$th range is

$$\psi(R_m) = \frac{m^{2m}}{[(m-1)!]^{2}}\,e^{-mR_m}\int_{-\infty}^{+\infty}\exp\left[-m\,e^{y} - m\,e^{-y-R_m}\right] dy. \tag{43}$$

The probability $\Psi(R_m)$ for the $m$th range

$$\Psi(R_m) = \int_{-\infty}^{R_m}\psi(z)\,dz$$

cannot be reduced to a single integral. This is due to the fact that the
probabilities of the $m$th extreme values cannot be written down except in the integral
form [6]. No differential equation similar to (17) exists. However, the function
(43) could be calculated by numerical methods. The mean $\bar{R}_m$, the generating
function and the moments of the $m$th range have been given in a previous
paper [8].
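A numerical calculation of the function (43) of the kind alluded to may be sketched as follows. This is a minimal sketch under stated assumptions: the normalizing constant $m^{2m}/[(m-1)!]^2$ is assumed (it is the value that makes the density integrate to unity), the integral is evaluated by the trapezoidal rule, and the function names, limits and step counts are illustrative.

```python
import math

def mth_range_pdf(R, m, lo=-30.0, hi=6.0, steps=1800):
    """psi(R_m) as in (43): C e^(-mR) * integral of exp[-m e^y - m e^(-y-R)] dy."""
    const = m ** (2 * m) / math.factorial(m - 1) ** 2  # assumed normalization
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        y = lo + i * h
        value = math.exp(-m * math.exp(y) - m * math.exp(-y - R))
        total += 0.5 * value if i in (0, steps) else value
    return const * math.exp(-m * R) * total * h

def total_mass(m, R_lo=-6.0, R_hi=20.0, dR=0.1):
    """Riemann sum of the density over [R_lo, R_hi]; the tails outside are negligible for small m."""
    count = round((R_hi - R_lo) / dR)
    return sum(mth_range_pdf(R_lo + k * dR, m) for k in range(count + 1)) * dR

if __name__ == "__main__":
    print(mth_range_pdf(1.0, 1), total_mass(2))
```

For $m = 1$ the function reduces to the distribution of the ordinary reduced range and can be compared with Table I; for $m = 2$ the total mass serves as a check on the assumed constant.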

For the sake of completeness, consider finally an unlimited asymmetrical initial
distribution of the exponential type. In this case, the joint distribution of the
smallest and of the largest value splits again, for large samples, into the product
of the asymptotic distributions $f_1(x_1)$ and $f_n(x_n)$ of the smallest and of the largest
values which are now [6]

$$f_1(x_1) = \alpha_1 \exp\left[\alpha_1(x_1 - u_1) - e^{\alpha_1(x_1 - u_1)}\right]$$

$$f_n(x_n) = \alpha_n \exp\left[-\alpha_n(x_n - u_n) - e^{-\alpha_n(x_n - u_n)}\right].$$

Here, $\alpha_n$ and $u_n$ are defined, as previously, by (6) and (8). The sample must
be so large that the most probable smallest value $\tilde{x}_1$ is sufficiently near to the
solution of

$$\Phi(u_1) = \frac{1}{n}.$$

The factor $\alpha_1$ defined by

$$\alpha_1 = \frac{\varphi(u_1)}{\Phi(u_1)}$$

is related to the asymptotic standard error of the smallest value by

$$\alpha_1\,\sigma_1 = \frac{\pi}{\sqrt{6}}.$$

The joint asymptotic distribution of the smallest value $x_1$ and the range $w$

$$\mathfrak{w}(x_1, w) = \alpha_1\alpha_n \exp\left[\alpha_1(x_1 - u_1) - \alpha_n(x_1 + w - u_n) - e^{\alpha_1(x_1 - u_1)} - e^{-\alpha_n(x_1 + w - u_n)}\right]$$

contains four parameters instead of the two which exist in the symmetrical case.
However, the number of parameters may be reduced to one. We introduce a
reduced range $R$ defined by

$$R = \alpha_n(w - u_n + u_1) \tag{44}$$

being the range itself minus the range of the modes divided by a factor
proportional to the standard error of the largest value. If we put

$$\alpha_1(x_1 - u_1) = y; \qquad \frac{\alpha_n}{\alpha_1} = \beta \tag{45}$$

the distribution $\psi(R)$ of the reduced range becomes, in the asymmetrical case,

$$\psi(R) = e^{-R}\int_{-\infty}^{+\infty}\exp\left[y(1 - \beta) - e^{y} - e^{-\beta y - R}\right] dy \tag{46}$$

and the probability $\Psi(R)$ for the reduced range is

$$\Psi(R) = \int_{-\infty}^{+\infty}\exp\left[y - e^{y} - e^{-\beta y - R}\right] dy \tag{47}$$

a formula which may immediately be verified by differentiation with respect to
$R$. The mode $\tilde{R}$ of the range is the solution of

$$\psi(\tilde{R}) = e^{-\tilde{R}}\int_{-\infty}^{+\infty}\exp\left[y(1 - 2\beta) - \tilde{R} - e^{y} - e^{-\beta y - \tilde{R}}\right] dy.$$

Contrary to the symmetrical case, the latter integral cannot be expressed by the
probability, and no simple differential equation similar to (17) exists. The
expressions (46) and (47) contain a single constant $\beta$ measuring the asymmetry of
the initial distribution. In the symmetrical case, $\beta = 1$, we obtain, of course, the
previous formulas (12) and (13). In the asymmetrical case, the mean, the
variance, and the higher moments of the $m$th range may be derived from the
generating function given in a previous paper [8].
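The probability (47) of the reduced range in the asymmetrical case can be evaluated by direct quadrature for any moderate value of the constant $\beta$. The following is a minimal sketch, assuming the integrand $\exp[y - e^{y} - e^{-\beta y - R}]$; the function name, the integration limits and the clamp that guards against floating-point overflow are illustrative choices.

```python
import math

def asym_range_cdf(R, beta, lo=-40.0, hi=6.0, steps=4600):
    """Trapezoidal estimate of (47): integral of exp[y - e^y - e^(-beta y - R)] dy."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        y = lo + i * h
        inner = math.exp(min(700.0, -beta * y - R))  # clamp avoids overflow of exp
        value = math.exp(y - math.exp(y) - inner)
        total += 0.5 * value if i in (0, steps) else value
    return total * h

if __name__ == "__main__":
    for beta in (0.7, 1.0, 1.5):
        print(beta, [round(asym_range_cdf(R, beta), 5) for R in (0.0, 1.0, 3.0)])
```

For $\beta = 1$ the values reduce to those of the symmetrical case and may be compared with Table I.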

The asymptotic distribution of the mth range in the asymmetrical case can 
easily be obtained by combining the two procedures used in this paragraph. 


REFERENCES 


[1] L. von Bortkiewicz, "Variationsbreite und mittlerer Fehler," Sitzungsberichte d.
Berliner Math. Gesellschaft, Vol. 21 (1921).
[2] L. von Bortkiewicz, "Die Variationsbreite beim Gauss'schen Fehlergesetz," Nordisk Statistisk
Tidskrift, Vol. 1 (1922).
[3] A. G. Carlton, "Estimating the parameters of a rectangular distribution," Annals
of Math. Stat., Vol. 17 (1946).
[4] R. A. Fisher and L. H. C. Tippett, "Limiting forms of the frequency distribution
of the largest or smallest member of a sample," Proc. Cambridge Phil. Soc.,
Vol. 24 (1928).
[5] H. A. Freeman, Industrial Statistics, John Wiley and Sons, 1942.
[6] E. J. Gumbel, "Les valeurs extrêmes des distributions statistiques," Ann. Inst. H.
Poincaré, Vol. 4 (1935).
[7] E. J. Gumbel, "On serial numbers," Annals of Math. Stat., Vol. 14 (1943).
[8] E. J. Gumbel, "Ranges and midranges," Annals of Math. Stat., Vol. 15 (1944).
[9] E. J. Gumbel, "On the independence of the extremes in a sample," Annals of Math. Stat.,
Vol. 17 (1946).
[10] H. O. Hartley, "The range in random samples," Biometrika, Vol. 32 (1942).
[11] M. G. Kendall, The Advanced Theory of Statistics, Vol. 1, London, 1943.
[12] A. T. McKay and E. S. Pearson, "A note on the distribution of range in sample sizes
of n," Biometrika, Vol. 25 (1933).
[13] -----, "Distribution of the difference between the extreme observations and the
sample mean in samples of n from a normal universe," Biometrika, Vol. 27
(1935).
[14] British Association for the Advancement of Science, Mathematical Tables Vol.
VI: Bessel Functions, Part I: Functions of order zero and unity, Cambridge
Univ. Press, 1937.
[15] E. S. Pearson, "A further note on the distribution of range in samples taken from a
normal population," Biometrika, Vol. 18 (1926).
[16] E. S. Pearson and H. O. Hartley, "The probability integral of the range in samples
of n observations from a normal population," Biometrika, Vol. 32 (1942).
[17] Karl Pearson, Tables for Statisticians and Biometricians, Part II, Cambridge Univ.
Press, 1931.
[18] Student, "Errors in routine analysis," Biometrika, Vol. 19 (1927).
[19] Arnold N. Lowan (Technical Director), Table of the Bessel Functions $K_0(x)$ and $K_1(x)$
for x between zero and one, Math. Tables Proj., New York.
[20] L. H. C. Tippett, "On the extreme individuals and the range of samples taken from a
normal population," Biometrika, Vol. 17 (1925).


ADDITION AT PROOF READING: 


G. Elfving’s article “The asymptotical distribution of range in samples from a normal 
population”, Biometrika, Vol. 35 (1947), appeared when this manuscript was ready for 


print. Elfving considers a probability transformation of the range whereas we deal with 
the range itself. His distribution requires the knowledge of the initial distribution and 
of the sample size, whereas this knowledge is not required in our asymptotic formula. 





LOW MOMENTS FOR SMALL SAMPLES: A COMPARATIVE STUDY OF 
ORDER STATISTICS 


By Cecil Hastings, Jr., Frederick Mosteller, John W. Tukey,
and Charles P. Winsor


Douglas Aircraft Co., Harvard University, Princeton University, 
and The Johns Hopkins University 


1. Summary. The means, variances, and covariances for samples of size
$\le 10$ from the normal distribution, a selected long-tailed distribution, and the
uniform distribution are tabled and compared with the usual asymptotic
approximations. The methods of computation used and the accuracy expected
are discussed. Use is made of the representation of an arbitrarily distributed
variate as a monotone function of a uniformly (rectangularly) distributed
variate. It is hoped that these tables will encourage experimentation with new
statistical procedures.


2. Introduction. Two sorts of statistical procedures have been widely ex- 
ploited in theoretical statistics—first the use of linear and quadratic combina- 
tions of the unordered observations and, second, the use of ranked (ordered) 
observations. Statistics based on ordered observations have recently been 
dubbed systematic statistics [2, Mosteller, 1946]. Analytic processes and a few 
necessary numerical tables have advanced the study of the first procedure greatly, 
at least for the special case of the normal distribution; but analytic procedures 
have not done much to exhibit the behavior of systematic statistics and the neces- 
sary tables have been lacking. 

It would be very helpful to have (1) at least the first two moments (including 
product moments) of the order statistics, and (2) tables of the percentage points 
of their distributions, for samples of sizes from 1 to some moderately large value 
such as 100 and for a large representative family of distributions. This is a 
large order and will require much computation. 

The first step in this direction was taken by Fisher and Yates [1] by tabulating
the means, to two decimal places, of all order statistics from normal samples of
size ≤ 50. The present paper continues the process by supplying all means,
variances, and covariances for samples of size ≤ 10 from (a) the normal
distribution, (b) the uniform (rectangular) distribution, (c) a special distribution
with long tails. For purposes of comparison, we also supply approximate
means, variances, and covariances for the normal and the special distribution
computed from suitable asymptotic formulas.

The special distribution has the representing function

(1)    $r(u) = (1 - u)^{-1/10} - u^{-1/10},$

where u has the uniform distribution on the interval [0, 1], and x = r(u) is the
variable whose order statistics interest us. This special distribution was
especially constructed 1) to have high tails and 2) to provide moments of order
statistics in closed form which could be evaluated with a reasonable amount of
labor. The normal distribution is rather unreasonable in this latter respect,
there being no known expression except in terms of single and double
quadratures of some considerable numerical difficulty.

We have restricted ourselves to samples of size ≤ 10, and to only three
distributions, all of these symmetrical, because of limited man-power rather than
limited interest. Additional tables of a similar nature will surely prove helpful.

In order to obtain even as much information as provided in this paper, it has 
been necessary to make a joint effort, dividing the labor. The various parts of 
the work have been carried out more or less separately by the various authors— 
the means and variances for the normal by Mosteller, the covariances for the 
normal (which, with their double quadratures, required far more time than all 
the other thought and computation combined) by Hastings with some assistance 
from Mosteller, the choice of the special distribution by Tukey, and the com- 
putation for it by Winsor. 


3. Results. In this section we provide the various tables that have been 
computed. 

Table I gives the mean and standard deviation of the ith order statistic
x(i | n) [or $x_{i|n}$; we use whichever notation seems less likely to confuse and
agree that x(1 | n) ≥ x(2 | n) ≥ ··· ≥ x(n | n)] from a sample of size n drawn
from a uniform (U), normal (N), and a special distribution (S). All three
distributions have been adjusted to have zero mean and unit variance. In
addition Table I gives approximations for the mean and standard deviation as
computed from asymptotic formulas for the normal (AN) and the special (AS).

If f(x) is the density function, the asymptotic approximation for the mean
m(i | n) of the ith order statistic from a sample of size n is obtained by solving
the equation

$$\int_{m(i|n)}^{\infty} f(x)\,dx = \frac{i}{n+1}$$

for m(i | n). Similarly the formula used for the asymptotic variance of x(i | n)
is

$$\frac{i(n-i+1)}{n(n+1)^2\,\{f[m(i|n)]\}^2}.$$

Values are given for n = 1, 2, ···, 10 and i = 1, ···, [(n + 1)/2]. If m(i | n) is
an entry in the table for means, a missing entry m(n − i + 1 | n) = −m(i | n);
if w(i | n) is an entry in the table of standard deviations, a missing entry
w(n − i + 1 | n) = w(i | n).
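The two asymptotic formulas above are easy to evaluate for the normal case. The following is a minimal sketch, not the authors' procedure: it assumes the standard normal density for f, uses Python's `statistics.NormalDist` for the inverse distribution function, and the function names are illustrative only.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # zero mean, unit variance, as in Table I


def asymptotic_mean(i: int, n: int) -> float:
    """Solve the upper-tail equation P(X > m) = i/(n+1) for m."""
    return STD_NORMAL.inv_cdf(1.0 - i / (n + 1))


def asymptotic_sd(i: int, n: int) -> float:
    """Square root of i(n-i+1) / (n (n+1)^2 f(m)^2), with n (not n+2) in the denominator."""
    m = asymptotic_mean(i, n)
    variance = i * (n - i + 1) / (n * (n + 1) ** 2 * STD_NORMAL.pdf(m) ** 2)
    return variance ** 0.5


# Print a few (n, i) pairs for comparison with the AN entries of Table I.
for n, i in [(1, 1), (2, 1), (10, 1)]:
    print(n, i, round(asymptotic_mean(i, n), 4), round(asymptotic_sd(i, n), 4))
```

Because x(1 | n) is taken as the largest observation, the upper tail area is set equal to i/(n + 1), so small i gives positive means.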


Table II gives the variances and covariances of the order statistics for the 
normal distribution (NV) and the same quantities as approximated by the asymp- 






























Mean 


n t AN 











.57735 
4307 








. 86603 
6745 



















.8416 





.34641  .29701 





2533 









. 9674 





.57735 
4307 


. 96419 


. 84628 


.49502 







AS 








- 53493 
.3418 


.80240 
.5466 


1.03923 1.02938 





- 98473 
. 6954 









. 25540 
. 1992 


1.15470 1.16296 


1.12449 
.8136 








42567 
.3418 








1.23718 
1.0676 





.74231 
. 9659 











. 1800 


1.26721 


.64176 


1.23847 
.9114 


.55458 
.4539 





3 .24744 .20155 = .16785 


. 1412 


TABLE I
Means and standard deviations of order statistics x(i | n) for uniform distribution
(U), normal (N), special (S), asymptotic normal (AN), asymptotic special (AS)


1.00000 


: ‘ .84490 






. -44903 





Standard Deviation 


U 





N 
AN AS 


1.00000 
1.2533 


S 





1.00000 
. 9804 
















81650 82565 


.9168 7486 








.67082 .74798  .82783 
. 7867 . 6823 


.77460 =.66983 =. 58457 
.7236 .5660 





.56569  .70122 
7144 


.82982 
. 6542 

















. 52582 
5035 


.69282  .60038 
. 6340 








.48795  .66898 
. 6670 


.83642 
.6415 




















.61721 = .55814 
.5798 


. 50390 
.4730 















65465 .53557 


.9605 .4384 















.42857 .64492 
.6331 


.84423 
. 6330 








.55328 .52874 .49425 
-5426 .4567 


.60609 .49620 .41648 


.O147 .4057 












































U 
AN 


1.29904 
as 








. 86603 







.43301 










1.34715 
1 





. 96225 








.04735 





. 19245 








. 38564 





.03923 


. 69282 





34641 















.6745 






.3186 





. 7647 





.4307 








1.2816 





.8416 


.0244 





2933 






N 


1.35218 
1504 


10037 


.3927 | 


1.42360 
. 2207 


.85222 





.47282 





. 15251 
1397 





1.48501 
l. 





.93230 





.57197 





27453 












.9957 


. 5462 


.2512 


1.0697 





. 6259 


.3418 




















TABLE I (Continued) 


S 
AS 







1.33506 






. 65892 






.29375 


1.41892 





. 74690 


- 39498 


. 12502 


1094 


1.49358 
1358 


.82317 
6954 


.47995 
4191 


. 22504 
1992 








34427 


.45542 


.51640 


.94433 


.31334 






.47863 


.91168 


. 92223 









U N 


AN 









38188 
. 6072 
















50000 
.9150 









.55902--.46875 ~—-.3 9963 
.4826 3172 


4737 





. 61066 


Standard Deviation 


. 62603 
.6141 


. 50670 
.4359 


45874 
.3617 


S 
AS 


.85217 





- 48992 















37747 





) .85988 


. 9867 .6276 











.4893( 
. 4936 











.4402 


.44807 








)  .48823 








. 38998 


. 4584 .3743 














A447 








. 5091 












.4763 








.4393 





.4227 


.4178 


.43264 


.59780 








.43171 


. 41303 


40751 















.30616 
. 3494 











.86725 
. 6268 









41779 =. 47508 


.48800 
.4361 















.38414 
3122 





.34321 
.3306 


.33173 
. 3268 







































TABLE I (Concluded)

                     Mean                         Standard Deviation
  n   i      U         N         S            U         N         S
                       AN        AS                     AN        AS

 10   1   1.41713   1.53875   1.56057      .28748    .58681    .87423
                    1.3352    1.1956                 .5557     .6275

      2   1.10222   1.00135    .89062      .38569    .46318    .48859
                     .9085    [illegible]            .4619     .4334

      3    .78730    .65608    .55336      .44536    .41826    .38054
                     .6046     .4866                 .4238     .3604

      4    .47238    .37572    .30866      .48105    .39756    .33477
                     .3488     .2754                 .4052     .3261

      5    .15746    .12274    .09961      .49793    .38857    .31190
                     .1142     .0894                 .3973     .3117





totic formulas (AN). The asymptotic covariance between x(i | n) and x(j | n)
is given by

$$\frac{i(n-j+1)}{n(n+1)^2\, f[m(i|n)]\, f[m(j|n)]}, \qquad i \le j.$$

Symmetry relations exist for supplying the missing entries,

$$\operatorname{cov}[x(i|n), x(j|n)] = \operatorname{cov}[x(n-i+1|n), x(n-j+1|n)].$$


It might seem more natural to use the factor n + 2 rather than n in the denomi- 
nator of the asymptotic variances and covariances so that the formulas would 
more nearly agree with those for the uniform distribution. However the use of 
n gives much better approximations for the normal and the special distribution. 

Table III gives the variances and covariances of the order statistics for the 
uniform distribution (U), and Table IV gives the corresponding results for the 
special distribution (S). Table V gives the asymptotic variances and co- 
variances for the special distribution (AS). 

Table VI compares the correlation coefficients between the order statistics
x(i | n) and x(j | n) for the uniform (U), the normal (N), and the special
distribution (S).

It seems worthwhile to call attention to the following: 

(1). Even for n = 10, the asymptotic formulas do not give satisfactory mean 
values for the order statistics. 


(2). For n > 8, the asymptotic standard deviations for the normal are close
enough to be very useful. For the special distribution we must except the two
order statistics on each end from this statement.


TABLE II 


Variances and covariances of the order statistics x(i | n) for the
normal (N) and the asymptotic normal (AN)





+ 5 | 9 10 


AN| N | AN 


SSS Ee 


























i 
14 








more | ome | wwe | we 





: .08 
-10) .10) i 
.12| .13) .09| .11| 
15) .16 
.07} .08 .07 
.09) . -08) .08| 
-11] .12| .10} .10) 
-14} .14} .12) .12) 
EC AG 




















.07| .08| .06! .06| .05/.05/.04.0a). 
.09| .09| .07/ .08 .06| 07) .05| .06}. 
11] .11} .09) .09) .08} .03| .06| .07 











| 
| 


.12} .13} .11| .11} .09).09) 


15] .16| .13| .13) | || 















































(3). For n > 8, the asymptotic variances and covariances of the normal are 
close enough for many, if not most purposes. 







(4). For the special distribution, only the variances and covariances of mod- 
erately central order statistics are adequately given by the asymptotic formulas. 


TABLE III 


Variances and covariances for the uniform distribution (U) 


1 2 4 6 
| 





| .66667| .33333| 


| .45000! .30000 

| .60000 

| 
| 

| .32000) .24000 
.48000) .3: 

| | 

.23810} .19047} .14286| .09522 
.38095) . . 19047 








| 18367] .15306 .09184| .06122| .03061| 
| | .30612 18367] .12245 
.36735| 27551 


| 


| 
| .14583] .12500] .10417| .08333] . .04167| .02083 
| 25000] .20833] .16667| . 08333 
.31250| .25000 
.33333 





| .10370| .08889) .07407| .05925) .04444) .02963) .01481 
.20741| .17778) .14815) .11852) .08889] .05925 
| .26667| .22222| .17778) .13333 





. 29630} .23704 





.08727| .07636) .06545) .05455) .04363) .03273 
.17455} 15273} .13091) .10909) .08727| .06545 

.22909} .19636) .16364| .13091) .09818 
. 26182] .21818] .17455 
27273 











8264! .07438| .06611) .05785) .04959) .04132) .03306) .02479| .01653) .00826 

.14876| .13223) .11570) .09917| .08264) .06611) .04959 
.19835| .17355| .14876| .12397| .09917| .07438 

.23140) .19835) .16529) .13223 

.24793] .20661 











(5). The correlation coefficients change rather little from distribution to dis- 
tribution, the poorest approximation being for end order statistics. 







TABLE IV 
Variances and covariances for the special distribution (S) 


1 2 3 7 





. 28615 





24214 
.34172 





.23277| .14141 
. 27649] .17532 





.69960| .23154| .13655) .10004| .08614 
.25391| .15418) .11490) 
. 20163 


—s 








.07080) 


.23310| .13544| .09667| .07786! 
24429] .14506| .10486) .08514 


. 17345 12762] 
| 


| 
| 
| 


w bo 


.72619} . .13582| .09565| .07517) .06409| 
14065 - 10004] .07913| .06776| | 
.15970| .11509| .09184) 


. 14249) 


-06042) 


One 


_ 





.73940| .23890| .13687| .09562| .07420| .06179} .05471) .05 
.23837| .13850| .09754| .07608| .06359) 
.15208| . 10822} .08499] .07138; 
. 12685} . 10053 





1 
2 
3 
4 


| 
| 
i 
| 
| 











. 24219) .13825} .09608 .07398| .06085) .05266) .04789 
.23814| .13756| .09625) .07443) .06141) .05327) .04852) 
. 14756] .10413} .08097| .06707| .05835) 
.11780} .09225) .07680) 
. 11004 


ofr WN eS 





.13978| .09680| .07414} .06053| .05176| .04604| .04271| .04272 


| .13732| .09565| .07354) .06018| .05156| .04594| .04266 


14481] .10158) .07846) .06444) .05533) .04940) 
.11207| .08707} .07180| .06186) 


| .10016) .08300| 


F | | | 


























or Wh 


4. Methods of calculation and accuracy for the normal distribution. The
means and variances of the order statistics for the normal distribution were
obtained from direct quadrature of forms like

$$\int_{-\infty}^{\infty} x^k\,[F(x)]^{n-i}\,[1 - F(x)]^{i-1} f(x)\,dx, \qquad k = 1, 2,$$

where

$$F(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt \quad \text{and} \quad f(x) = F'(x).$$






































10 


" 
2 
3 


4 

















a 
NI 1 
ti \ 
~ 

1 .56044 
1 .46550 
2 
1 -42792 
2 





won 





.37715 








mm wh 


1 39389 





1 39286 




2 


- 28022) 





22297 
32038 
. 20168 
- 25347 


. 19167 
. 22368 





. 18667 
. 20861 


. 17527 
. 19004 





. 18276 
. 19382 


. 18226 
. 19019) 








39373 





1 
2 
3 
4 
5 


. 18242 
. 18784 


















TABLE V
Variances and covariances of the special distribution as computed
from asymptotic formulas





.15517 


13444 
- 16898 


. 12579 


- 14679 
19221 


16457 





. 11304 
. 12258 
. 14232 


. 11746 
. 12458 
14011 


11881 
. 12398 
. 13855 


11677 
. 12024 
. 12988 


12105 
13529) 





- 10698 


.11208 


.10147 
12343 





-08394 
-09103 
. 10569 


-08669 
-09194 
10341 
12211 


.08591 
.08965 








- 10019 
11265 


.08560 
-08813 
.09520 
- 10633 








.08341 





-06782 
.07354 
-08538 


-06935 
-07355 
-08272 
-09769 





06829 
.07126 
-07963 
.08958 
. 10680 





.06775 
.06977 
.07536 
.08417 
.09716 





.05814 


.07014 
.08098 





.05873 
.06229 
-07005 


.05727 
.05977 
.06678 
.07512 


.05646 


.06280 


It is believed that the means are correct to within one unit in the fifth decimal 
and that the standard deviations are correct to within 2 or 3 units in the fifth 
decimal. 
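A direct quadrature of this kind can be sketched in a few lines. This is a sketch under stated assumptions rather than a reconstruction of the original hand computation: composite Simpson integration on the interval [-10, 10] is our own choice, the exponents follow the convention that x(1 | n) is the largest observation, and the function names are hypothetical.

```python
from math import comb
from statistics import NormalDist

STD_NORMAL = NormalDist()


def order_stat_moment(i, n, k, half_width=10.0, steps=4000):
    """k-th raw moment of the i-th largest of n standard normal observations.

    Uses composite Simpson quadrature on [-half_width, half_width];
    `steps` must be even.
    """
    def integrand(x):
        F = STD_NORMAL.cdf(x)
        return x ** k * F ** (n - i) * (1.0 - F) ** (i - 1) * STD_NORMAL.pdf(x)

    h = 2.0 * half_width / steps
    total = integrand(-half_width) + integrand(half_width)
    for j in range(1, steps):
        weight = 4 if j % 2 else 2  # Simpson weights 1, 4, 2, 4, ..., 4, 1
        total += weight * integrand(-half_width + j * h)
    return i * comb(n, i) * total * h / 3.0


def mean_and_sd(i, n):
    """Mean and standard deviation of x(i | n) for the unit normal."""
    m1 = order_stat_moment(i, n, 1)
    m2 = order_stat_moment(i, n, 2)
    return m1, (m2 - m1 * m1) ** 0.5


print(mean_and_sd(1, 2))
```

The covariances need a double integral and are correspondingly more expensive, which matches the authors' remark that they were much more troublesome.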








-05842| .05388 
.06335 


.05221 





.05092 
.05313 
.05938 


.04891 
.05036 
-05440 
.06076 





The evaluation of the covariances was much more troublesome, requiring the
evaluation of iterated integrals of the form

$$\int_{-\infty}^{\infty} x f(x)\,[F(x)]^{p} \int_{x}^{\infty} t f(t)\,[1 - F(t)]^{q}\,dt\,dx$$

for integer exponents p and q.


.05538


.04924


.04556
.04754


.04367


.04379
.04508
.04871


.04054
.04174


.03937




Necessary linear combinations of such forms give rise to considerable loss of 
accuracy. The covariances are believed to be correct to within 1 unit in the 
second decimal (except for one or two values which may be off by two units). 


TABLE VI 


Correlation coefficients × 10² between order statistics x(i | n), x(j | n) for the
uniform (U), normal (N), and special distribution (S)





} | | 
1 [58|/55|50)33/29| 23 





| 1 |o1\58|53]41|38, 32) 25, 

















w 


~I ol 
ot Oe 








| 
| 
| 


66|62|56}49|45| 
| | 74/73, 


- on 9 | 
we & 





5046 | 
8| 46) 
79} 93) 62 
| 30} 80) 7 











'37 |32 [33 |: 
60 [57 |: 
80 |79 











74] 62) 61) 
81} 80) § 


| 








Better tables of these covariances are badly needed, and it is hoped that someone 
will provide them. 


The asymptotic values are correct to the two decimals given. 

















5. Computation in terms of the representing function. It will prove
convenient in working with the special distribution, as indeed it does in many
statistical procedures, to introduce the representing function r(u), which is a
monotone function such that

$$\Pr\{r(u_1) \le x \le r(u_2)\} = u_2 - u_1, \qquad u_2 \ge u_1.$$

Thus if u has a uniform (= rectangular on [0, 1]) distribution then x = r(u)
defines a variate with the given distribution.


The ith order statistic of n from the uniform distribution, $u_{i|n}$, is distributed
according to

$$i\binom{n}{i}\, u^{n-i}(1-u)^{i-1}\,du, \qquad 0 \le u \le 1,$$

where it is important to remember that $u_{1|n}$ is the largest and not the smallest
order statistic; and the joint distribution of $u = u_{i|n}$ and $v = u_{j|n}$ (j > i)
is given by

$$i(j-i)\binom{n}{i,\; j-i,\; n-j}\, v^{n-j}(u-v)^{j-i-1}(1-u)^{i-1}\,du\,dv, \qquad 0 \le v \le u \le 1,$$

where $\binom{n}{i,\; j-i,\; n-j} = \dfrac{n!}{i!\,(j-i)!\,(n-j)!}$ is a multinomial coefficient.


The means, variances, and covariances which we desire can be written as
follows (it is immaterial whether we think of expectations over x’s or over u’s):

$$E(x_{i|n}) = E(r(u_{i|n})) = i\binom{n}{i}\int_0^1 r(u)\,u^{n-i}(1-u)^{i-1}\,du,$$

$$\operatorname{var}(x_{i|n}) = E(x_{i|n}^2) - (E(x_{i|n}))^2 = E(r^2(u_{i|n})) - (E(x_{i|n}))^2
= i\binom{n}{i}\int_0^1 r^2(u)\,u^{n-i}(1-u)^{i-1}\,du - (E(x_{i|n}))^2,$$

$$\operatorname{cov}(x_{i|n}, x_{j|n}) = E(x_{i|n}x_{j|n}) - E(x_{i|n})\,E(x_{j|n})
= E(r(u_{i|n})\,r(u_{j|n})) - E(x_{i|n})\,E(x_{j|n})$$

$$= i(j-i)\binom{n}{i,\; j-i,\; n-j}\int_0^1\!\!\int_v^1 r(u)\,r(v)\,v^{n-j}(u-v)^{j-i-1}(1-u)^{i-1}\,du\,dv
- E(x_{i|n})\,E(x_{j|n}).$$








Introducing $E_{s,t}$ by

$$E_{s,t} = \int_0^1\!\!\int_v^1 r(u)\,r(v)\,u^s v^t\,du\,dv,$$

we have

$$E(x_{i|n}x_{j|n}) = i(j-i)\binom{n}{i,\; j-i,\; n-j}
\sum_{k,m} (-1)^{k+m}\binom{i-1}{k}\binom{j-i-1}{m} E_{j-i-1-m+k,\; n-j+m}$$

and, in particular,

$$E(x_{1|2}x_{2|2}) = 2E_{0,0},$$
$$E(x_{1|5}x_{4|5}) = 60E_{2,1} - 120E_{1,2} + 60E_{0,3}.$$

Introducing $E_{s,s}$ by

$$E_{s,s} = \int_0^1 r^2(u)\,u^s\,du,$$

we have

$$E(x_{i|n}^2) = i\binom{n}{i}\sum_k (-1)^k\binom{i-1}{k} E_{n-i+k,\; n-i+k}$$

and, in particular,

$$E(x_{2|5}^2) = 20E_{3,3} - 20E_{4,4}.$$

Introducing $E_s$ by

$$E_s = \int_0^1 r(u)\,u^s\,du,$$

we have

$$E(x_{i|n}) = i\binom{n}{i}\sum_k (-1)^k\binom{i-1}{k} E_{n-i+k}$$

and, in particular,

$$E(x_{3|5}) = 30E_2 - 60E_3 + 30E_4.$$

Thus the computation of the desired means, variances, and covariances is
reduced to the computation of the integrals $E_s$, $E_{s,s}$, and $E_{s,t}$.
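The binomial expansions above can be checked exactly in the one case where every integral is elementary, the uniform distribution with r(u) = u. The sketch below is an illustration, not part of the original computation: it uses exact rational arithmetic, takes the region of integration for the double integral as 0 < v < u < 1, and the helper names `E1`, `E2`, `mean`, and `product_moment` are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def E1(s):
    """E_s = integral over (0, 1) of r(u) u^s du, for r(u) = u."""
    return Fraction(1, s + 2)


def E2(s, t):
    """E_{s,t} for r(u) = u: integral over 0 < v < u < 1 of u^(s+1) v^(t+1)."""
    return (Fraction(1, t + 2) - Fraction(1, s + t + 4)) / (s + 2)


def mean(i, n):
    """E(x_{i|n}) from the single expansion in powers of u."""
    return i * comb(n, i) * sum(
        (-1) ** k * comb(i - 1, k) * E1(n - i + k) for k in range(i))


def product_moment(i, j, n):
    """E(x_{i|n} x_{j|n}) for i < j from the double expansion."""
    multinomial = factorial(n) // (factorial(i) * factorial(j - i) * factorial(n - j))
    total = sum(
        (-1) ** (k + m) * comb(i - 1, k) * comb(j - i - 1, m)
        * E2(j - i - 1 - m + k, n - j + m)
        for k in range(i) for m in range(j - i))
    return i * (j - i) * multinomial * total


# Covariance of the largest and smallest of three uniform observations.
print(product_moment(1, 3, 3) - mean(1, 3) * mean(3, 3))
```

The covariances obtained this way can be compared with the closed form i(n - j + 1)/((n + 1)^2 (n + 2)) for the uniform distribution.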

We shall also want to calculate the asymptotic approximations to the means,
variances, and covariances of the order statistics. For the uniform distribution,
it is well known that

$$\operatorname{mean}(u_{i|n}) = \frac{n-i+1}{n+1},$$

$$\operatorname{var}(u_{i|n}) = \frac{i(n-i+1)}{(n+1)^2(n+2)},$$

$$\operatorname{cov}(u_{i|n}, u_{j|n}) = \frac{i(n-j+1)}{(n+1)^2(n+2)}, \qquad i < j.$$

These asymptotic formulas are transformed from u to x by the relations
x = r(u) and dx = r′(u) du, giving

$$\text{approx mean}(x_{i|n}) = r\!\left(\frac{n-i+1}{n+1}\right),$$

$$\text{approx var}(x_{i|n}) = \left\{r'\!\left(\frac{n-i+1}{n+1}\right)\right\}^2 \frac{i(n-i+1)}{(n+1)^2(n+2)},$$

$$\text{approx cov}(x_{i|n}, x_{j|n}) = r'\!\left(\frac{n-i+1}{n+1}\right) r'\!\left(\frac{n-j+1}{n+1}\right) \frac{i(n-j+1)}{(n+1)^2(n+2)}, \qquad (i < j);$$

as noted above, in our calculations we have replaced n + 2 by n in the
denominator.

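6. Reduction of integrals for the special case. When the representing
function is

$$x = r(u) = \frac{1}{(1-u)^{\lambda}} - \frac{1}{u^{\lambda}} \qquad (\lambda > 0),$$

we obtain a symmetrical distribution with long tails. (For the normal
distribution $r(u) = o(\ln u)$ as $u \to 0$.) The integrals we want are

$$E_s = \int_0^1 \{(1-u)^{-\lambda} - u^{-\lambda}\}\,u^s\,du,$$

$$E_{s,s} = \int_0^1 \{(1-u)^{-\lambda} - u^{-\lambda}\}^2\,u^s\,du,$$

$$E_{s,t} = \int_0^1\!\!\int_v^1 \{(1-u)^{-\lambda} - u^{-\lambda}\}\{(1-v)^{-\lambda} - v^{-\lambda}\}\,u^s v^t\,du\,dv,$$

which can be expressed as

$$E_s = A_s(\lambda) - B_s(\lambda),$$
$$E_{s,s} = A_{s,s}(\lambda) - 2B_{s,s}(\lambda) + C_{s,s}(\lambda),$$
$$E_{s,t} = A_{s,t}(\lambda) - B_{s,t}(\lambda) - C_{s,t}(\lambda) + D_{s,t}(\lambda),$$

where

$$A_s(\lambda) = \int_0^1 (1-u)^{-\lambda} u^s\,du = b(-\lambda, s),$$

$$B_s(\lambda) = \int_0^1 u^{-\lambda} u^s\,du = \frac{1}{s+1-\lambda},$$

$$A_{s,s}(\lambda) = \int_0^1 (1-u)^{-2\lambda} u^s\,du = b(-2\lambda, s),$$

$$B_{s,s}(\lambda) = \int_0^1 (1-u)^{-\lambda} u^{-\lambda} u^s\,du = b(-\lambda, s-\lambda),$$

$$C_{s,s}(\lambda) = \int_0^1 u^{-2\lambda} u^s\,du = \frac{1}{s+1-2\lambda},$$

$$A_{s,t}(\lambda) = \int_0^1\!\!\int_v^1 (1-u)^{-\lambda}(1-v)^{-\lambda} u^s v^t\,du\,dv
= \sum_k \binom{s}{k}(-1)^k\,\frac{b(k+1-2\lambda,\, t)}{k+1-\lambda},$$

$$B_{s,t}(\lambda) = \int_0^1\!\!\int_v^1 (1-u)^{-\lambda} v^{-\lambda} u^s v^t\,du\,dv
= \sum_k \binom{s}{k}(-1)^k\,\frac{b(k+1-\lambda,\, t-\lambda)}{k+1-\lambda}
= \frac{b(s+t+1-\lambda,\, -\lambda)}{t+1-\lambda},$$

$$C_{s,t}(\lambda) = \int_0^1\!\!\int_v^1 u^{-\lambda}(1-v)^{-\lambda} u^s v^t\,du\,dv
= \frac{1}{s+1-\lambda}\,\{b(-\lambda, t) - b(s+t+1-\lambda,\, -\lambda)\},$$

$$D_{s,t}(\lambda) = \int_0^1\!\!\int_v^1 u^{-\lambda} v^{-\lambda} u^s v^t\,du\,dv
= \frac{1}{(t+1-\lambda)(s+t+2-2\lambda)},$$

where throughout

$$b(p, q) = \frac{p!\,q!}{(p+q+1)!} = \frac{\Gamma(p+1)\,\Gamma(q+1)}{\Gamma(p+q+2)}.$$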
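The single-integral reductions above translate directly into a short computation of exact means and standard deviations for the special distribution. This is a minimal sketch, assuming λ = 1/10 (a value consistent with the S entries of Table I) and rescaling to zero mean and unit variance; the function names are ours, and only the formulas for $E_s$ and $E_{s,s}$ are used.

```python
from math import comb, gamma

LAM = 0.1  # assumed exponent of the special distribution


def b(p, q):
    """b(p, q) = Gamma(p+1) Gamma(q+1) / Gamma(p+q+2)."""
    return gamma(p + 1.0) * gamma(q + 1.0) / gamma(p + q + 2.0)


def E1(s):
    """E_s = A_s - B_s."""
    return b(-LAM, s) - 1.0 / (s + 1.0 - LAM)


def E_sq(s):
    """E_{s,s} = A_{s,s} - 2 B_{s,s} + C_{s,s}."""
    return b(-2.0 * LAM, s) - 2.0 * b(-LAM, s - LAM) + 1.0 / (s + 1.0 - 2.0 * LAM)


def raw_moment(i, n, integral):
    """i C(n,i) sum_k (-1)^k C(i-1,k) integral(n-i+k)."""
    return i * comb(n, i) * sum(
        (-1) ** k * comb(i - 1, k) * integral(n - i + k) for k in range(i))


SIGMA2 = E_sq(0)  # variance of x = r(u); the mean is zero by symmetry


def mean_and_sd(i, n):
    """Mean and standard deviation of x(i | n) in standardized units."""
    m = raw_moment(i, n, E1)
    second = raw_moment(i, n, E_sq)
    return m / SIGMA2 ** 0.5, ((second - m * m) / SIGMA2) ** 0.5


print(mean_and_sd(1, 10))
```

The alternating sums lose some accuracy to cancellation for central order statistics, so for larger n an exact or higher-precision evaluation would be a safer design.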





7. Calculations for the special distribution. The computations for the special
distribution were made from the formulas in the preceding section. The
quantities A, B, C, D were computed from r = s = 0 to r + s = 8, whence the values
of $E_s$, $E_{s,s}$, $E_{s,t}$ were calculated. The values of the means, variances, and
covariances were then obtained from the formulas of section 3.

The means, variances, and covariances are believed to be accurate to the five 
decimal places given. 


8. Formulas and accuracy for the uniform. The means, variances, and co- 
variances of the uniform are given near the end of section 5. Since r(u) = u, 
they are also the values given by the asymptotic approximation, when n + 2 
is used. 

The tabulated values were computed to six decimal places and rounded to the 
four or five decimals given. 
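For the uniform distribution the tabled values follow from the closed forms alone. The sketch below assumes the standardization used in the tables (zero mean and unit variance, so that means are centered and multiplied by the square root of 12 and covariances are multiplied by 12); the function names are illustrative.

```python
from fractions import Fraction


def uniform_mean(i, n):
    """Standardized mean of the i-th largest of n uniform observations."""
    centered = Fraction(n - i + 1, n + 1) - Fraction(1, 2)
    return float(centered) * 12 ** 0.5


def uniform_cov(i, j, n):
    """Standardized covariance of x(i | n) and x(j | n); equals the variance when i == j."""
    i, j = min(i, j), max(i, j)
    return float(12 * Fraction(i * (n - j + 1), (n + 1) ** 2 * (n + 2)))


# Upper triangle of the covariance matrix for n = 3.
for i in range(1, 4):
    print([round(uniform_cov(i, j, 3), 5) for j in range(i, 4)])
```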


REFERENCES 


[1] R. A. Fisher and F. Yates, Statistical Tables, Oliver and Boyd, London, 1943.
[2] Frederick Mosteller, ‘‘On some useful ‘inefficient’ statistics,’’ Annals of Math. Stat.,
Vol. 17 (1946), pp. 377-408.







SEQUENTIAL CONFIDENCE INTERVALS FOR THE MEAN OF A NORMAL
DISTRIBUTION WITH KNOWN VARIANCE


By CHARLES STEIN AND ABRAHAM WALD 


Columbia University 


1. Summary. We consider sequential procedures for obtaining confidence 
intervals of prescribed length and confidence coefficient for the mean of a normal 
distribution with known variance. A procedure achieving these aims is called 
optimum if it minimizes the least upper bound (with respect to the mean) of the 
expected number of observations. The result proved is that the usual non- 
sequential procedure is optimum. 


2. Introduction. The problem of sequential confidence sets in general has
been considered briefly by one of the authors [1]. Let $\{X_i\}$, $(i = 1, 2, \cdots)$,
be a sequence of random variables whose distribution is specified except for the
value of a parameter $\theta$ whose range is a space $\Omega$. Sequential confidence sets are
determined by a rule as to when to stop sampling, together with a function of
the sample whose value is one of a specified class of subsets of $\Omega$. The class of
subsets is chosen in advance depending on the purpose of the estimation. For
example, it may be the class of all intervals of prescribed length or the class of
all sets whose diameter does not exceed a given value. It is required that the
probability that this (random) set covers $\theta$ should be greater than or equal to a
specified confidence coefficient $\alpha$ for all $\theta$. A procedure for finding sequential
confidence intervals is considered optimum if it minimizes some specified function
of the expected numbers of observations. Here this function is taken to be the
least upper bound. In contrast with the result of this paper, a case where
sequential confidence intervals may have an advantage over non-sequential
procedures has been given by one of the authors [2]. The $X_i$ are independently
normally distributed with unknown mean and unknown variance, and the
problem is to find confidence intervals of fixed length for the unknown mean. As
was first shown by Dantzig [3] this cannot be accomplished by a non-sequential
procedure. Another case where this is true is the problem of finding confidence
intervals of the form $(p_0, kp_0)$ where $k$ is a specified number greater than 1, for
the probability in a binomial distribution.

Let $\{X_i\}$, $(i = 1, 2, \cdots)$, be independently normally distributed with
unknown mean $\xi$ and known variance $\sigma_0^2$. It is desired to specify a sequential
procedure for obtaining confidence intervals of fixed length $l$ for the mean $\xi$.
This is provided by a rule according to which at each stage of the experiment,
after obtaining the first $m$ observations $X_1, \cdots, X_m$ for each integral value $m$,
one makes one of the following decisions:

a) Take an $(m + 1)$st observation.

b) Terminate the procedure and state that the mean lies in the interval
$(Y - \tfrac{1}{2}l,\; Y + \tfrac{1}{2}l)$, where $Y = \varphi_m(X_1, \cdots, X_m)$, $\varphi_m$ being a measurable
real-valued function.

The serial number $m$ of the observation on which the procedure terminates is,
of course, a random variable and will be denoted by $n$.

For any relation $R$ the symbol $P(R \mid \xi)$ will denote the probability that $R$
holds when $\xi$ is the true mean of $X_i$. The confidence coefficient of a sequential
procedure $S$ is defined by

(1)    $\alpha(S) = \operatorname*{g.l.b.}_{\xi}\; P(Y - \tfrac{1}{2}l \le \xi \le Y + \tfrac{1}{2}l \mid \xi).$

Denote by $n_0(S)$ the maximum expected number of observations, i.e.

(2)    $n_0(S) = \operatorname*{l.u.b.}_{\xi}\; E(n \mid \xi, S)$

where $E(n \mid \xi, S)$ denotes the expected value of $n$ when $\xi$ is the true mean and the
procedure $S$ is used.

A procedure $S$ will be considered optimum if, for all $S'$ such that $\alpha(S') = \alpha(S)$,

(3)    $n_0(S) \le n_0(S').$


It will be shown that an optimum procedure $S(\nu, c)$ can be obtained as follows:

a) For all $m < \nu$, a fixed positive integer, take another observation.

b) For $m = \nu$, terminate the procedure if

(4)    $\sum_{1}^{\nu} X_i^2 - \frac{1}{\nu}\left(\sum_{1}^{\nu} X_i\right)^2 > c\,\sigma_0^2$

and let $Y = \frac{1}{\nu}\sum_{1}^{\nu} X_i$. (The inequality (4) is used merely as a device for fixing
the probability of taking $\nu$ observations, this random event to be independent
of whether $(Y - \tfrac{1}{2}l,\; Y + \tfrac{1}{2}l)$ covers $\xi$, given $\nu$.)

c) Otherwise take a $(\nu + 1)$st observation, terminating the process, and let

$Y = \frac{1}{\nu + 1}\sum_{1}^{\nu+1} X_i.$

When $c = 0$, this is the usual non-sequential procedure.
Clearly,

(5)    $\alpha[S(\nu, c)] = P\{\chi^2_{\nu-1} > c\}\, H\!\left(\frac{l\sqrt{\nu}}{2\sigma_0}\right) + \left[1 - P\{\chi^2_{\nu-1} > c\}\right] H\!\left(\frac{l\sqrt{\nu+1}}{2\sigma_0}\right)$

where

(6)    $H(u) = \frac{1}{\sqrt{2\pi}}\int_{-u}^{u} e^{-x^2/2}\,dx = \sqrt{\frac{2}{\pi}}\int_{0}^{u} e^{-x^2/2}\,dx.$

Also

(7)    $n_0[S(\nu, c)] = \nu + 1 - P\{\chi^2_{\nu-1} > c\}.$

By a proper choice of $\nu$ and $c$ we can achieve any desired confidence coefficient
$\alpha > H\!\left(\frac{l}{2\sigma_0}\right)$. There is no essential loss of generality in considering only the
case $\sigma_0 = 1$, and this will be done in the remainder of this paper.
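Formulas (5) to (7) can be tabulated directly once the probability of stopping at the $\nu$th observation is fixed. The sketch below is an illustration under an explicit simplification: instead of computing $P\{\chi^2_{\nu-1} > c\}$ from $c$, it takes that probability as an input `p_stop`, and the function names are hypothetical.

```python
from math import erf, sqrt


def H(u):
    """H(u) of (6): probability that a unit normal variate lies in (-u, u)."""
    return erf(u / sqrt(2.0))


def confidence_and_size(nu, p_stop, length, sigma0=1.0):
    """Return (alpha, n0) for S(nu, c) per (5) and (7).

    p_stop stands for P{chi^2_{nu-1} > c}, the probability of stopping
    after nu observations; it is supplied directly rather than derived from c.
    """
    half = 0.5 * length / sigma0
    alpha = p_stop * H(half * sqrt(nu)) + (1.0 - p_stop) * H(half * sqrt(nu + 1))
    n0 = nu + 1 - p_stop
    return alpha, n0


# With p_stop = 1 the procedure is the usual fixed-sample one.
print(confidence_and_size(16, 1.0, 0.98))
print(confidence_and_size(16, 0.5, 0.98))
```

Varying `p_stop` between 0 and 1 interpolates the confidence coefficient between the fixed-sample values for $\nu + 1$ and $\nu$ observations, which is the role the device (4) plays in the text.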


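3. A lower bound for $n_0(S)$ and an upper bound for $\alpha(S)$. Consider any
sequential procedure $S$ for obtaining confidence intervals of length $l$. Put

(8)    $\alpha(\xi, S) = P\{Y - \tfrac{1}{2}l \le \xi \le Y + \tfrac{1}{2}l \mid \xi\}.$

That is, $\alpha(\xi, S)$ is the probability that the confidence interval will cover the true
mean $\xi$ when the procedure $S$ is used. According to (1),

(9)    $\alpha(S) = \operatorname*{g.l.b.}_{\xi}\; \alpha(\xi, S).$

In order to obtain a lower bound for $n_0(S)$ and an upper bound for $\alpha(S)$, we
suppose that the procedure $S$ is applied when $\xi$ is not a fixed number but a
random variable normally distributed with mean 0 and variance $\sigma^2$. Then the
probability that the confidence interval covers $\xi$ is

(10)    $\bar\alpha(\sigma, S) = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{+\infty} e^{-\xi^2/2\sigma^2}\,\alpha(\xi, S)\,d\xi \ge \alpha(S)$

and the expected number of observations is

(11)    $E(n \mid \sigma, S) = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{+\infty} e^{-\xi^2/2\sigma^2}\,E(n \mid \xi, S)\,d\xi \le n_0(S).$

Let $p_m(\xi, S)$, $(m = 1, 2, \cdots, \text{ad inf.})$, denote the probability that $n = m$
when $\xi$ is the true mean and procedure $S$ is used. Put

(12)    $\bar p_m(\sigma, S) = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{+\infty} e^{-\xi^2/2\sigma^2}\,p_m(\xi, S)\,d\xi.$

Since

(13)    $E(n \mid \sigma, S) = \sum_{m=1}^{\infty} m\,\bar p_m(\sigma, S)$

we obtain from (11)

(14)    $\sum_{m=1}^{\infty} m\,\bar p_m(\sigma, S) \le n_0(S).$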
m=1 


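We shall now derive an upper bound for $\bar\alpha(\sigma, S)$. Since $X_i = \xi + \epsilon_i$ where the
$\epsilon_i$ are independently normally distributed with mean 0 and variance 1, the joint
distribution of $\xi$ and $X_i$, $(i = 1, \cdots, m)$, is a multivariate normal distribution
with

(15)    $E\xi = EX_i = 0$

and covariance matrix

(16)    $\begin{pmatrix} \sigma^2 & \sigma^2 & \cdots & \sigma^2 \\ \sigma^2 & \sigma^2 + 1 & \cdots & \sigma^2 \\ \vdots & \vdots & \ddots & \vdots \\ \sigma^2 & \sigma^2 & \cdots & \sigma^2 + 1 \end{pmatrix}.$

Thus the conditional distribution of $\xi$ given $X_1, \cdots, X_m$ is normal with mean

(17)    $E(\xi \mid X_1, \cdots, X_m) = (\sigma^2, \cdots, \sigma^2)\begin{pmatrix} \sigma^2 + 1 & \cdots & \sigma^2 \\ \vdots & \ddots & \vdots \\ \sigma^2 & \cdots & \sigma^2 + 1 \end{pmatrix}^{-1}\begin{pmatrix} X_1 \\ \vdots \\ X_m \end{pmatrix} = \frac{\sigma^2}{m\sigma^2 + 1}\sum_{i=1}^{m} X_i$

and variance

(18)    $\sigma^2 - \frac{m\sigma^4}{m\sigma^2 + 1} = \frac{\sigma^2}{m\sigma^2 + 1}.$

If $X_1, \cdots, X_m$ is a sequence for which the process is terminated on the $m$th
trial, the conditional probability that the interval of length $l$ will cover $\xi$ is
clearly maximized by taking

(19)    $Y = E(\xi \mid X_1, \cdots, X_m) = \frac{\sigma^2}{m\sigma^2 + 1}\sum_{i=1}^{m} X_i$

and, by (18) this probability has the value $H(c_m)$ where $H$ is defined by (6) and

(20)    $c_m = \frac{l}{2}\sqrt{m + \frac{1}{\sigma^2}}.$

Hence,

(21)    $\bar\alpha(\sigma, S) \le \sum_{m=1}^{\infty} \bar p_m(\sigma, S)\,H(c_m).$

From this and (10) we obtain

(22)    $\alpha(S) \le \sum_{m=1}^{\infty} \bar p_m(\sigma, S)\,H(c_m).$

This upper limit of $\alpha(S)$ and the lower limit of $n_0(S)$ given in (14) will be used
later to prove that $S(\nu, c)$ is an optimum procedure.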
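The conditional mean and variance in (17) and (18) can be checked numerically by updating one observation at a time. The sketch below is an illustration only: it assumes unit observation variance as in the text, compares the closed forms with a sequential update of our own choosing, and the function names are hypothetical.

```python
from math import erf, sqrt


def posterior_closed_form(xs, sigma2):
    """Mean and variance of xi given X_1..X_m, per (17) and (18)."""
    m = len(xs)
    return sigma2 * sum(xs) / (m * sigma2 + 1.0), sigma2 / (m * sigma2 + 1.0)


def posterior_sequential(xs, sigma2):
    """Same quantities, obtained by conditioning on one observation at a time."""
    mean, var = 0.0, sigma2
    for x in xs:
        gain = var / (var + 1.0)  # observation error has unit variance
        mean += gain * (x - mean)
        var *= 1.0 - gain
    return mean, var


def coverage(m, length, sigma2):
    """H(c_m) with c_m = (l/2) sqrt(m + 1/sigma^2), per (20)."""
    c_m = 0.5 * length * sqrt(m + 1.0 / sigma2)
    return erf(c_m / sqrt(2.0))


xs = [0.3, -1.2, 2.5]
print(posterior_closed_form(xs, 4.0), posterior_sequential(xs, 4.0))
```

As the prior variance grows, `coverage(m, length, sigma2)` decreases toward the fixed-sample value H(l sqrt(m) / 2), the limit used later in the optimality proof.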


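4. Maximum value of $\sum_{1}^{\infty} \bar p_m(\sigma, S)H(c_m)$ subject to the condition that
$\sum_{1}^{\infty} m\,\bar p_m(\sigma, S)$ does not exceed a given bound. We shall show that the maximum
of $\sum_{1}^{\infty} \bar p_m(\sigma, S)H(c_m)$ subject to

$E(n \mid \sigma, S) = \sum_{1}^{\infty} m\,\bar p_m(\sigma, S) \le \nu + a,$

where $\nu$ is a positive integer and $0 \le a < 1$, is obtained by choosing $\bar p_m(\sigma, S) = p_m^*$
defined by

(23)    $p_m^* = 0$ for $m < \nu$ or $m > \nu + 1$,
        $p_\nu^* = 1 - a$,
        $p_{\nu+1}^* = a$.

For, suppose to the contrary that there exists a sequence $\{p_m\}$ such that the
following conditions hold:

(24)    $p_m \ge 0, \qquad \sum_{1}^{\infty} p_m = 1,$
        $\sum_{1}^{\infty} m\,p_m \le \nu + a = \sum_{1}^{\infty} m\,p_m^*,$
        $\sum_{1}^{\infty} p_m H(c_m) > \sum_{1}^{\infty} p_m^* H(c_m).$

We have

(25)    $H(u) = \sqrt{\frac{2}{\pi}}\int_{0}^{u} e^{-x^2/2}\,dx = \frac{1}{\sqrt{2\pi}}\int_{0}^{u^2} y^{-1/2} e^{-y/2}\,dy.$

Let

(26)    $C = H(c_{\nu+1}) - H(c_\nu) = \frac{1}{\sqrt{2\pi}}\int_{c_\nu^2}^{c_{\nu+1}^2} y^{-1/2} e^{-y/2}\,dy.$

With the aid of $p_\nu = 1 - \sum_{m \ne \nu} p_m$, we obtain from the last two inequalities
in (24)

(27)    $0 < \sum (p_m - p_m^*)H(c_m) - C\sum (p_m - p_m^*)\,m = \sum (p_m - p_m^*)K_m$

where

(28)    $K_m = H(c_m) - H(c_\nu) - (m - \nu)[H(c_{\nu+1}) - H(c_\nu)].$

Clearly $K_\nu = K_{\nu+1} = 0$. Also, for $m < \nu$, since the integrand is a strictly decreasing
function of $y$, and since $c_{m+1}^2 - c_m^2 = l^2/4$ for every $m$,

(29)    $K_m = \frac{1}{\sqrt{2\pi}}\left[(\nu - m)\int_{c_\nu^2}^{c_{\nu+1}^2} y^{-1/2} e^{-y/2}\,dy - \int_{c_m^2}^{c_\nu^2} y^{-1/2} e^{-y/2}\,dy\right]$
        $< \frac{1}{\sqrt{2\pi}}\left[(\nu - m)\frac{l^2}{4}\,c_\nu^{-1} e^{-c_\nu^2/2} - (\nu - m)\frac{l^2}{4}\,c_\nu^{-1} e^{-c_\nu^2/2}\right] = 0.$

Similarly for $m > \nu + 1$, $K_m < 0$. But $p_m^* = 0$ for $m \ne \nu, \nu + 1$ so that

(30)    $\sum_{m \ne \nu,\, \nu+1} (p_m - p_m^*)K_m \le 0$

which contradicts (27) since $K_\nu = K_{\nu+1} = 0$.
Thus, we have shown that the inequality

(31)    $E(n \mid \sigma, S) \le \nu + a$

implies the inequality

(32)    $\sum_{1}^{\infty} \bar p_m(\sigma, S)H(c_m) \le (1 - a)H(c_\nu) + aH(c_{\nu+1}).$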
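The sign pattern of $K_m$ that drives this argument is easy to confirm numerically. The following is a small sketch, not part of the proof: the values of $\nu$, $l$, and $\sigma$ are arbitrary illustrative choices, and the function names are hypothetical.

```python
from math import erf, sqrt


def H(u):
    """H(u) of (6)."""
    return erf(u / sqrt(2.0))


def c(m, length, sigma):
    """c_m of (20)."""
    return 0.5 * length * sqrt(m + 1.0 / sigma ** 2)


def K(m, nu, length, sigma):
    """K_m of (28): H(c_m) minus the chord through m = nu and m = nu + 1."""
    def h(k):
        return H(c(k, length, sigma))
    return h(m) - h(nu) - (m - nu) * (h(nu + 1) - h(nu))


nu, length, sigma = 5, 1.0, 2.0
for m in range(1, 11):
    print(m, K(m, nu, length, sigma))
```

The values vanish at $m = \nu$ and $m = \nu + 1$ and are negative elsewhere, reflecting that $H(c_m)$ is a concave function of $m$.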
5. Proof that $S(\nu, c)$ is an optimum procedure. Since, according to (14)
and (22)

(33)    $n_0(S) \ge E(n \mid \sigma, S)$ and $\alpha(S) \le \sum_{1}^{\infty} \bar p_m(\sigma, S)H(c_m),$

it follows from the result expressed in (31) and (32) that, for any procedure $S$
satisfying the inequality

(34)    $n_0(S) \le \nu + a,$

we must have

(35)    $\alpha(S) \le (1 - a)H(c_\nu) + aH(c_{\nu+1})$

identically in $\sigma$. Since $H(x)$ is continuous, it follows that

(36)    $\alpha(S) \le (1 - a)H(\tfrac{1}{2}l\sqrt{\nu}) + aH(\tfrac{1}{2}l\sqrt{\nu + 1})$

for any procedure $S$ satisfying (34).

The right hand side of (36) is $\alpha[S(\nu, c)]$ where $c$ is chosen so that

(37)    $1 - a = P\{\chi^2_{\nu-1} > c\}.$

We use an indirect proof to show that $S(\nu, c)$ is an optimum procedure.
Suppose to the contrary that there is a procedure $S'$ such that

(38)    $\alpha(S') = \alpha[S(\nu, c)]$

but

(39)    $n_0(S') < n_0[S(\nu, c)].$

By (5) and (7), $\alpha[S(\nu, c)]$ is a continuous strictly increasing function of

$\nu + 1 - P\{\chi^2_{\nu-1} > c\}$

and this latter is $n_0[S(\nu, c)]$. If we choose $\nu'$, $c'$ so that

(40)    $n_0(S') \le \nu' + 1 - P\{\chi^2_{\nu'-1} > c'\} < \nu + 1 - P\{\chi^2_{\nu-1} > c\},$

it follows that

(41)    $\alpha[S(\nu', c')] < \alpha[S(\nu, c)] = \alpha(S').$

But (41) and the first part of (40) contradict the result expressed in (34) and
(36).


REFERENCES 


[1] A. Wald, Sequential Analysis, John Wiley and Sons, 1947, section 11.2.

[2] Charles Stein, ‘‘A two-sample test for a linear hypothesis whose power is independent
of the variance’’, Annals of Math. Stat., Vol. 16 (1945), pp. 243-258.

[3] G. B. Dantzig, ‘‘On the non-existence of tests of ‘Student’s’ hypothesis having power
functions independent of σ’’, Annals of Math. Stat., Vol. 11 (1940), p. 186.











NOTES 


This section is devoted to brief research and expository articles on methodology 
and other short items. 




A USEFUL CONVERGENCE THEOREM FOR 
PROBABILITY DISTRIBUTIONS 


By Henry Scheffé
University of California at Los Angeles 


In problems of establishing limiting distributions it is often apparent that the
probability density $p_n(x)$ of a random variable $X_n$ has a limit $p(x)$; throughout
this paper $n = 1, 2, 3, \cdots$, and all limits are taken as $n \to \infty$. If $p(x)$ is the
density of a random variable $X$, what we really care about then is whether the
limits apply to probabilities, which involve integrals of the densities: Does
$\lim \Pr\{X_n \text{ in } S\} = \Pr\{X \text{ in } S\}$ for all¹ Borel sets $S$, or, does

(1)    $\lim \int_S p_n(x)\,dx = \int_S p(x)\,dx\;?$


The question is thus one of taking a limit under an integral sign. Perhaps the
most widely used justification of such a process is the following theorem of
Lebesgue [1, p. 47; 2, p. 29]: If for a sequence $\{f_n(x)\}$ of integrable functions,
$\lim f_n(x) = f(x)$ for almost all $x$ in $S$, then a sufficient condition that

$$\lim \int_S f_n(x)\,dx = \int_S f(x)\,dx$$

is that there exist an integrable function $g(x)$ which uniformly dominates the
sequence $\{f_n(x)\}$, that is, $|f_n(x)| \le g(x)$ for all $n$ and all $x$ in $S$, and $\int_S g(x)\,dx < \infty$.


For example, in the excellent new treatise by Cramér the limiting form of the
t-distribution is treated as follows [1, p. 252; other examples on pp. 369,
371]: For $n$ degrees of freedom the t-variable has the density


(2)    $p_n(x) = c_n (1 + x^2/n)^{-(n+1)/2}$,

where

(3)    $c_n = (n\pi)^{-1/2}\,\Gamma(\tfrac12(n+1))/\Gamma(\tfrac12 n)$.


It is shown fairly easily that $\lim p_n(x) = p(x)$, the density of $N(0, 1)$, where


¹ In defining the convergence of a sequence of distributions to the distribution of a dis-
continuous random variable $X$ it is desirable to modify this requirement so that it is de-
manded only of sets $S$ which are continuity intervals of $X$ [1, p. 83]. We are concerned here
however only with the "absolutely continuous case" where $X$ has a probability density $p(x)$.




$N(m, \sigma^2)$ denotes the normal distribution with mean $m$ and variance $\sigma^2$. Then
to prove


    $\lim \int_{-\infty}^{\xi} p_n(x)\,dx = \int_{-\infty}^{\xi} p(x)\,dx$,


Cramér shows that $\{p_n(x)\}$ is uniformly dominated by an integrable function.
It is instructive to consider some examples where


(4)    $\lim \int_{-\infty}^{\xi} p_n(x)\,dx$

does not equal

(5)    $\int_{-\infty}^{\xi} \lim p_n(x)\,dx$.


In the examples (i), (ii), (iii), $\lim p_n(x) = 0$ for all $x$ and hence (5) is zero for
all $\xi$.

(i) $p_n(x) = 1$ for $-n - 1 < x < -n$, zero elsewhere. Then (4) equals 1 for all $\xi$.

(ii) $p_n(x) = 1/n$ for $-\tfrac12 n < x < \tfrac12 n$, zero elsewhere. Here (4) equals $\tfrac12$ for all $\xi$.

(iii) $p_n(x) = 2n^2 x$ for $0 < x < 1/n$, zero elsewhere. Now (4) is zero for
$\xi \le 0$, unity for $\xi > 0$.

An example in which $\lim p_n(x) \not\equiv 0$ is

(iv) $p_n(x) = \tfrac12[h_n(x) + p_0(x)]$, where $h_n$ is the $p_n$ of one of the above examples
and $p_0$ is a fixed density. Then $\lim p_n(x) = \tfrac12 p_0(x)$. Now (4) exceeds (5) by
half the amount it did in the corresponding above example.
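
The values quoted for (4) in examples (i) to (iii) can be checked from the distribution functions, which are available in closed form. The following Python sketch is an illustration only: the function names are ours, and a large fixed $n$ stands in for the limit.

```python
# Distribution functions F_n(xi) = integral of p_n from -infinity to xi
# for examples (i), (ii), (iii).  Names are illustrative, not from the paper.

def cdf_i(n, xi):
    # (i): p_n = 1 on (-n-1, -n); the unit mass escapes to -infinity.
    return min(max(xi + n + 1, 0.0), 1.0)

def cdf_ii(n, xi):
    # (ii): p_n = 1/n on (-n/2, n/2); the mass spreads out symmetrically.
    return min(max((xi + n / 2) / n, 0.0), 1.0)

def cdf_iii(n, xi):
    # (iii): p_n = 2 n^2 x on (0, 1/n); the mass piles up just right of 0.
    x = min(max(xi, 0.0), 1.0 / n)
    return (n * x) ** 2

n = 10**6
for xi in (-3.0, 0.0, 2.5):
    print(xi, cdf_i(n, xi), cdf_ii(n, xi), cdf_iii(n, xi))
```

For each fixed $\xi$ the three columns approach 1, $\tfrac12$, and the step function of example (iii), while the pointwise limit of the densities is zero in every case.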

The essential features of these examples could be obtained with normal
distributions but would involve a little more computation, for instance, $N(-n, 1)$,
$N(0, n^2)$, $N(1/n, 1/n^4)$, for examples (i), (ii), (iii), respectively.

We note that in none of these examples is lim p,(x) a density. This suggests 
that the trouble might perhaps be prevented by requiring that lim p,(x) be a 
density—which happens in the case from which we started. This surmise is 
correct. We may formalize the situation as follows: 

DEFINITION. A function $f(x)$ will be called a density if it is non-negative and
$\int_R f(x)\,dx = 1$. Here $R$ denotes the whole space of $x$.


The reader may think of a univariate density, where x is a real variable and 
R is the real axis, but theorem and proof run the same for a k-variate density, 
where x is a point in a k-dimensional Euclidean space R. 

THEOREM. If for a sequence $\{p_n(x)\}$ of densities


    $\lim p_n(x) = p(x)$





2 The hypotheses of this theorem, while perfectly adapted to applications in probability 
and statistics, would not seem the ‘‘natural’’ ones in real variable or measure theory. Pro- 
fessor A. P. Morse has remarked to the writer that, if the theorem has not been stated in this 
form before, it is at least an easy corollary of some more general results known in that field. 
Nevertheless our direct proof based only on the familiar Lebesgue theorem and using only 












































for almost all $x$ in $R$, then a sufficient condition that

    $\lim \int_S p_n(x)\,dx = \int_S p(x)\,dx$,

uniformly for all Borel sets $S$ in $R$, is that $p(x)$ be a density.
PROOF. Let us write the difference

(6)    $p_n(x) - p(x) = \delta_n(x)$.

Then

(7)    $\delta_n(x) \to 0$

for almost all $x$ in $R$. Also

(8)    $\int_S \delta_n\,dx = \int_S p_n\,dx - \int_S p\,dx$,

and so it suffices to prove that $\int_S \delta_n\,dx \to 0$ uniformly for all $S$ in $R$, where $S$
henceforth denotes a Borel set. If in (8) we let $S = R$ we get

(9)    $\int_R \delta_n\,dx = 0$

since $p_n$ and $p$ are densities. We now split the difference $\delta_n(x)$ into its positive
and negative parts: Let

(10)    $\delta_n^+ = \tfrac12(\delta_n + |\delta_n|)$,    $\delta_n^- = \tfrac12(\delta_n - |\delta_n|)$,

so that

    $\delta_n = \delta_n^+ + \delta_n^-$.

From (7) and (10), we find

(11)    $\delta_n^- \to 0$

for almost all $x$ in $R$, and from (9),

(12)    $\int_R \delta_n^+\,dx + \int_R \delta_n^-\,dx = 0$.





very simple manipulations may be of interest to readers of the Annals. Professor Morse
also pointed out that the stronger result $\lim \int_S |p_n(x) - p(x)|\,dx = 0$ uniformly for all $S$,
may be stated. This follows from our proof since

    $\int_S |p_n - p|\,dx = \int_S \delta_n^+\,dx - \int_S \delta_n^-\,dx$.
s s s 
















By virtue of (6), $\delta_n \ge -p$. Now if $\delta_n \le 0$, $\delta_n^- = \delta_n \ge -p$, and if $\delta_n > 0$, $\delta_n^- =
0 \ge -p$, and hence in every case $0 \ge \delta_n^- \ge -p$. Since we now have $|\delta_n^-(x)| \le
p(x)$ and $\int_R p(x)\,dx = 1$, we may apply³ the Lebesgue theorem to get

    $\lim \int_R \delta_n^-\,dx = \int_R \lim \delta_n^-\,dx$.

The right member is zero because of (11). It then follows from (12) that
$\lim \int_R \delta_n^+\,dx$ is also zero. The relations

    $0 \le \int_S \delta_n^+\,dx \le \int_R \delta_n^+\,dx \to 0$,

    $0 \ge \int_S \delta_n^-\,dx \ge \int_R \delta_n^-\,dx \to 0$

guarantee that the quantities $\int_S \delta_n^+\,dx$ and $\int_S \delta_n^-\,dx$ have the limit zero uniformly
for all $S$, and hence the same is true of their sum (8).
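
As a numerical illustration of the theorem (ours, not the author's), the sketch below estimates $\int |p_n - p|\,dx$ for the t-density (2) against the $N(0, 1)$ density by a midpoint rule. The truncation to $[-60, 60]$, the step count, and the function names are assumptions made for the sketch. Since both functions are densities, (12) gives $\sup_S |\int_S p_n\,dx - \int_S p\,dx| = \tfrac12 \int_R |p_n - p|\,dx$, so a shrinking $L^1$ distance is the uniform convergence asserted.

```python
import math

def t_density(x, n):
    # Density (2) with the constant (3), computed on the log scale for stability.
    log_c = (math.lgamma((n + 1) / 2) - math.lgamma(n / 2)
             - 0.5 * math.log(n * math.pi))
    return math.exp(log_c - (n + 1) / 2 * math.log1p(x * x / n))

def normal_density(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def l1_distance(n, half_width=60.0, steps=60_000):
    # Midpoint rule on [-half_width, half_width]; mass outside is ignored,
    # which matters only slightly, and only for very small n.
    h = 2 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * h
        total += abs(t_density(x, n) - normal_density(x))
    return total * h

distances = {n: l1_distance(n) for n in (1, 4, 16, 64, 256)}
for n, d in distances.items():
    print(n, round(d, 5), "sup over S is about", round(d / 2, 5))
```

No dominating function is constructed anywhere in this computation; the only inputs are the pointwise limit and the fact that the limit is a density.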

Returning to the example (2), we remark that it is practically obvious that the
second factor on the right has the limit $e^{-x^2/2}$, but it is not quite so obvious that
$\lim c_n = (2\pi)^{-1/2}$. This situation is typical of many applications where it is
more difficult to evaluate the limit of "the" constant than the limit of the re-
maining factors, and one wonders after obtaining the latter limit whether the
constant is not automatically forced toward the limit desired for it, and whether
the direct calculation of its limit could not be avoided. Let us put the question
as follows: Suppose that

    $\{p_n(x) = c_n f_n(x)\}$

is a sequence of densities and that

    $p(x) = c f(x)$

is also a density. Then if $\lim f_n(x) = f(x)$ for almost all $x$, may we conclude
that $\lim c_n = c$? If so, we could then apply the above theorem without having
evaluated the limit of the constant or produced a dominating function. Un-
fortunately the answer to this question is no, as shown by example (iv) above:





³ Although our proof rests on the Lebesgue convergence theorem, this theorem is applied
to $\delta_n^-(x)$ and not to $p_n(x)$. While in most cases of practical interest the sequence $\{p_n(x)\}$
is uniformly dominated by an integrable function, it is possible to devise a simple example
where this is not true and yet our theorem applies: Let $p_n(x) = 1$ for $1/(n+1) \le x < 1$
and for $a_n < x < a_{n+1}$, zero elsewhere, where $a_n = \sum_{i=1}^{n} 1/i$. Then $\sup_n p_n(x) = 1$ for
all $x > 0$, nevertheless $\lim p_n(x)$ is a density, namely that of the uniform distribution on
$(0, 1)$.







If we let $f_n(x) = h_n(x) + p_0(x)$, and $f(x) = p_0(x)$, then $\lim f_n(x) = f(x)$, but
$c_n = \tfrac12$ and $c = 1$, hence $\lim c_n \ne c$. Employing the assumption that $p_n(x)$
and $p(x)$ are densities we see

    $1/c_n = \int_R f_n\,dx$,    $1/c = \int_R f\,dx$,

and hence $\lim c_n = c$ if and only if

(13)    $\lim \int_R f_n(x)\,dx = \int_R \lim f_n(x)\,dx$.


It follows that in such cases if we wish to establish a limiting distribution in the 
sense (1), we may either prove lim c, = c, or we may justify (13), say by produ- 
cing a suitable dominating function, but we need not do both. No doubt the 
first alternative would be preferable at all but the most advanced levels of 
teaching or exposition. 
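
For the t-distribution the first alternative is easy to check numerically. The sketch below (an illustration under our own naming, not part of the original note) evaluates the constant $c_n$ of (3) through the log-gamma function and compares it with $(2\pi)^{-1/2}$.

```python
import math

def c_n(n):
    # Constant (3): (n*pi)^(-1/2) * Gamma((n+1)/2) / Gamma(n/2),
    # with the gamma ratio taken on the log scale to avoid overflow.
    log_ratio = math.lgamma((n + 1) / 2) - math.lgamma(n / 2)
    return math.exp(log_ratio) / math.sqrt(n * math.pi)

target = 1 / math.sqrt(2 * math.pi)
gaps = [abs(c_n(n) - target) for n in (1, 10, 100, 1000, 10000)]
for n, gap in zip((1, 10, 100, 1000, 10000), gaps):
    print(n, c_n(n), gap)
```

The gap shrinks roughly like $1/n$, in agreement with $\lim c_n = (2\pi)^{-1/2}$.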


REFERENCES 


[1] H. Cramér, The Mathematical Methods of Statistics, Princeton Univ. Press, 1946.
[2] S. Saks, Theory of the Integral, Stechert, New York, 1937.























AN EXPLICIT REPRESENTATION OF A STATIONARY 
GAUSSIAN PROCESS 


By M. Kac¹ and A. J. F. Siegert





Cornell University and Syracuse University 


1. In a paper which will soon appear in the Journal of Applied Physics [1] 
the authors have introduced methods of calculating certain probability dis- 
tributions which are of importance in the theory of random noise in radio re- 
ceivers. 

The complexity of the physical problem and occasional uses of heuristic reason- 
ings may have obscured some of the mathematical points. For this reason the 
authors felt that it may be worth while to illustrate one of the basic ideas on a 
simple but important example. 

2. A stationary Gaussian process is a one parameter family $x(t)$ of random
variables such that:

(a). $x(t)$ is normally distributed; the mean and the variance being inde-
pendent of $t$

(b). the joint probability distribution of $x(t_1), x(t_2), \ldots, x(t_n)$ is multivariate
Gaussian whose parameters depend only on the differences $t_i - t_j$.


1 John Simon Guggenheim Memorial Fellow. 

































We assume, for the sake of simplicity, that the process is normalized, i.e.,

    $E\{x(t)\} = 0$,    $E\{x^2(t)\} = 1$

and we define the correlation function $\rho(\tau)$ by the usual formula

    $\rho(\tau) = E\{x(t)x(t + \tau)\}$.

It is then well known² that a distribution function $\sigma(u)$ exists such that for all $\tau$

(1)    $\rho(\tau) = \int_0^{\infty} \cos u\tau\,d\sigma(u)$.


3. Let $0 \le s, t \le T$ and consider the symmetric kernel

    $K(s, t) = \rho(s - t)$.

The fact that $\sigma(u)$ is non-decreasing implies that the kernel $\rho(s - t)$ is quasi-
definite, i.e., for every $L^2$ function $g(t)$ on $(0, T)$ one has

    $\int_0^T \int_0^T g(s)\rho(s - t)g(t)\,ds\,dt \ge 0$.

Thus the eigenvalues of the integral equation

(2)    $\int_0^T \rho(s - t)f(t)\,dt = \lambda f(s)$

are non-negative. Moreover, denoting by $\lambda_j$ the eigenvalues and by $f_j(t)$ the
corresponding normalized eigenfunctions of (2) we have by the classical theorem
of Mercer (see [4], in particular part 6 of Ch. I) that

(3)    $\rho(s - t) = \sum_j \lambda_j f_j(s) f_j(t)$,

where the series on the right is absolutely and uniformly convergent. It should
be noted that in virtue of (1) $\rho(\tau)$ is a continuous function.


4. Let now $G_1, G_2, G_3, \ldots$ be independent, normally distributed random
variables each having mean 0 and variance 1.
Consider the series

(4)    $\sum_j \sqrt{\lambda_j}\,G_j f_j(t)$.

Since for each $t$ we have

    $\sum_j (\sqrt{\lambda_j}\,f_j(t))^2 = \sum_j \lambda_j f_j^2(t) = \rho(0) = 1$,

we infer that for each $t$ the series (4) converges in the mean to a random variable
$x(t)$. Moreover, by a theorem of Kolmogoroff [5], the series (4) converges, for
each $t$, to $x(t)$ with probability 1.


2 See [2]. The theorem in question (in a somewhat different form) seems to have been 
first established by N. Wiener in [3]. 




Thus we may write

(5)    $x(t) = \sum_j \sqrt{\lambda_j}\,G_j f_j(t)$.

It is now easy to show that $x(t)$ thus defined is a stationary Gaussian process
in $(0, T)$ with the correlation function $\rho(\tau)$.
In fact,

    $E\{x(s)x(t)\} = \sum_j \lambda_j f_j(s) f_j(t) = \rho(s - t)$,    $0 \le s, t \le T$,


and conditions (a) and (b) of section 2 follow from the well known properties of 
linear combinations of independent Gaussian random variables. Of course, 
we are dealing here with infinite linear combinations but the mean convergence 
noted above, is sufficient to justify the extension to our case. 

5. It is more illuminating to think of the random variables $G_j$ as measurable
functions $G_j(\omega)$ defined on an abstract set $\Omega$ in which a Lebesgue measure has
been established (the measure of the whole space being 1).

The representation (5) can then be written in the equivalent form

(6)    $x(t, \omega) = \sum_j \sqrt{\lambda_j}\,G_j(\omega) f_j(t)$.

The equality, as established in section 4, holds for every $t$ in the sense of mean
convergence. Moreover, by the theorem of Kolmogoroff cited above, and by
Fubini's theorem the equality (6) holds for almost every pair $(t, \omega)$, $(0 \le t \le T)$,
in the sense of ordinary convergence.

Furthermore by Mercer's theorem (remember that $\lambda_j \ge 0$)

    $\sum_j \lambda_j = \int_0^T \rho(s - s)\,ds = T$

and hence

    $\sum_j E\{\lambda_j G_j^2\} = \sum_j \lambda_j \int_\Omega G_j^2(\omega)\,d\omega = \sum_j \lambda_j = T < \infty$.

Thus the series

    $\sum_j \lambda_j G_j^2(\omega)$

converges for almost every $\omega$ and therefore the series

(7)    $\sum_j \sqrt{\lambda_j}\,G_j(\omega) f_j(t)$

converges in the mean for almost every $\omega$.

Combining this fact with the observation that (7) converges almost every-
where to $x(t, \omega)$ we see that, for almost every $\omega$, the series (7) converges in
the mean to $x(t, \omega)$ and that consequently
























(8)    $\int_0^T x^2(t, \omega)\,dt = \sum_j \lambda_j G_j^2(\omega)$

for almost every $\omega$.

It should be noted that (8) could not, in general, be derived by just appealing
to Parseval's relation. The main reason is that Parseval's relation holds only
for complete orthonormal systems whereas the orthonormal system $\{f_j(t)\}$ of
eigenfunctions may fail to be complete. If the kernel $\rho(s - t)$ is positive-
definite (in which case all the eigenvalues are positive instead of just non-nega-
tive) then it is known that the eigenfunctions form a complete set. This actu-
ally happens to be the case in most physical applications.

6. An important application of (8) is the calculation of the characteristic
function of the distribution function of the random variable

(9)    $I = \int_0^T x^2(t, \omega)\,dt$.

In fact,

(10)    $E\{\exp(i\xi I)\} = \prod_j E\{\exp(i\xi\lambda_j G_j^2)\} = \prod_j (1 - 2i\xi\lambda_j)^{-1/2}$.

The probability density of $I$ is the Fourier integral

    $\frac{1}{2\pi}\int_{-\infty}^{\infty} \exp(-i\xi I)\prod_j (1 - 2i\xi\lambda_j)^{-1/2}\,d\xi$

which, unfortunately, in most cases cannot be calculated explicitly. If

    $\rho(\tau) = e^{-\beta|\tau|}$,

in which case the process is also Markoffian, the eigenvalues $\lambda_j$ can be cal-
culated explicitly³ but in more complicated cases it is quite difficult to deter-
mine them.
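
Each factor in (10) is the characteristic function of $\lambda G^2$ with $G$ a standard normal variable. A short numerical check may make this concrete; the sketch below is ours, and the integration window, step count, function names and the sample eigenvalues in the usage note are assumptions chosen for illustration, not values from the paper.

```python
import cmath
import math

def cf_numeric(xi, lam, half_width=12.0, steps=48_000):
    # E{exp(i*xi*lam*G^2)} for G ~ N(0,1), by a midpoint rule over g.
    h = 2 * half_width / steps
    total = 0j
    for k in range(steps):
        g = -half_width + (k + 0.5) * h
        total += cmath.exp(1j * xi * lam * g * g) * math.exp(-0.5 * g * g)
    return total * h / math.sqrt(2 * math.pi)

def cf_closed(xi, lam):
    # Single factor of (10).  Since 1 - 2i*xi*lam has positive real part,
    # the principal branch of the square root is the continuous one.
    return (1 - 2j * xi * lam) ** -0.5

def cf_of_I(xi, eigenvalues):
    # Product in (10), truncated to the eigenvalues supplied.
    product = 1 + 0j
    for lam in eigenvalues:
        product *= cf_closed(xi, lam)
    return product

print(cf_numeric(0.7, 1.3), cf_closed(0.7, 1.3))
```

Given any finite list of eigenvalues, for instance the hypothetical `cf_of_I(0.7, [1.3, 0.4, 0.1])`, the truncated product approximates the characteristic function of $I$; the density then requires the Fourier inversion mentioned above.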

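7. If $\rho(\tau)$ is absolutely integrable and $\sigma(u)$ absolutely continuous then, setting

    $A(u) = \sigma'(u)$,

we have $A(u) \ge 0$ and

    $\rho(\tau) = \int_0^{\infty} \cos u\tau\,A(u)\,du = \int_{-\infty}^{\infty} e^{iu\tau} B(u)\,du$,    $B(u) = \tfrac12 A(|u|)$.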





3 See [6], in particular section 4. We take this opportunity to correct two misprints in 
this note. In the last formula on p. 64 M should be replaced by N. Also the limits of 
integration in formula (6) should be 0, s and s, p + q instead of 0, p + q and 0,p + q. 
The N.D.R.C. Report 14-305 to which a reference is made has been declassified in the 
meantime. It contains results which originated both [1] and the present note. 
4 These and related results were stated in the abstract [7] by M. Kac. The paper is now 
being prepared for publication. 




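It can then be shown⁴ that

    $\lim_{T \to \infty} \frac{1}{T}\sum_j \lambda_j^2 = 2\pi\int_{-\infty}^{\infty} B^2(u)\,du = \int_{-\infty}^{\infty} \rho^2(\tau)\,d\tau$,

    $\lim_{T \to \infty} \frac{1}{T}\sum_j \lambda_j^k = (2\pi)^{k-1}\int_{-\infty}^{\infty} B^k(u)\,du$.

It follows now by standard methods that the characteristic function of

(11)    $\frac{1}{\sqrt{T}}\left\{\int_0^T x^2(t)\,dt - T\right\}$

approaches, as $T \to \infty$,

    $\exp\left(-\frac{\sigma^2}{2}\xi^2\right)$,

where

    $\sigma^2 = 2\int_{-\infty}^{\infty} \rho^2(\tau)\,d\tau$.

Thus, as $T \to \infty$, the distribution of (11) becomes normal with mean 0 and
variance $\sigma^2$.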

REFERENCES 

[1] M. Kac and A. J. F. Siegert, "On the theory of random noise in radio receivers with
square law detectors." To appear in Jour. of Applied Physics.

[2] A. Khintchine, "Korrelationstheorie der stationären stochastischen Prozesse," Math.
Ann., Vol. 109 (1934), pp. 604-615.

[3] N. Wiener, "Generalized harmonic analysis," Acta Math., Vol. 55 (1930), pp. 117-258.

[4] G. Hamel, Integralgleichungen, Julius Springer, Berlin, 1937.

[5] A. Kolmogoroff, "Über die Summen durch den Zufall bestimmter unabhängiger
Grössen," Math. Ann., Vol. 99 (1928), pp. 309-319.

[6] M. Kac, "Random walk in the presence of absorbing barriers," Annals of Math. Stat.,
Vol. 16 (1945), pp. 62-67.

[7] M. Kac, "Distribution of eigenvalues of certain integral equations with an application
to roots of Bessel functions," Abstract, Bull. Amer. Math. Soc., Vol. 52 (1946),
pp. 65-66.


APPROXIMATE FORMULAS FOR THE RADII OF CIRCLES 
WHICH INCLUDE A SPECIFIED FRACTION OF A 
NORMAL BIVARIATE DISTRIBUTION 
By E. N. Oberg
University of Iowa

1. Introduction. Given the normal bivariate error distribution

(1)    $\varphi(x, y) = (1/2\pi\sigma_x\sigma_y)\,e^{-\frac12(x^2/\sigma_x^2 + y^2/\sigma_y^2)}$,

the purpose of this paper is to present certain approximate formulas for the
radii of circles whose centers are at the origin, which include a prescribed pro-
portion, $p$, of errors. The formulas are, for given $\sigma_x$, $\sigma_y$, and $p$,

(2)    $R_1 = \sqrt{2\sigma_x\sigma_y \ln(1/[1 - p])}$,

(3)    $R_2 = \sqrt{(\sigma_x^2 + \sigma_y^2)\ln(1/[1 - p])}$

and

(4)    $R_3 = (\sigma_x + \sigma_y)\sqrt{(1/2)\ln(1/[1 - p])}$.


In section 3 we present tables of $p'$, the true proportion of errors contained in
circles whose radii are given by the above formulas. These tables reflect the
goodness of approximation of each formula to the true radius, $R$, for $0.1 \le p \le
0.9$ and $0.5 \le \sigma_x/\sigma_y \le 0.9$. Also, a brief statement is included for the same range
of $p$ but with $0.1 \le \sigma_x/\sigma_y \le 0.4$.


2. The derivation of the formulas. The proportion $p$ of errors that fall
within an area $A$ on the $xy$-plane is given by

(5)    $p = \iint_A \varphi(x, y)\,dA$.

If the area is bounded by any member of the family of ellipses

    $x^2/\sigma_x^2 + y^2/\sigma_y^2 = \lambda^2$,

the above integral may be evaluated directly. The result is

    $p = 1 - e^{-\lambda^2/2}$,

whence

    $\lambda^2 = 2\ln(1/[1 - p])$.

Thus the ellipse with semi-axes

(6)    $\sigma_x\sqrt{2\ln(1/[1 - p])}$,    $\sigma_y\sqrt{2\ln(1/[1 - p])}$,

measured from the origin along the $x$ and $y$ axes respectively, will include ex-
actly the prescribed proportion of errors.

Frequently, however, it is desired to know which circles rather than which
ellipses include a certain proportion of the errors. In this case it becomes
difficult to obtain a formula for the true radius from (5) unless $\sigma_x = \sigma_y$ in which
case $R$ is given by either one of the formulas in (6). However, a natural ap-
proximation to make is to equate the area of a circle of radius, say $R_1$, to the area
of the ellipse whose semi-axes are given in (6). This gives formula (2),

    $R_1 = \sqrt{2\sigma_x\sigma_y \ln(1/[1 - p])}$,

which can be expected to give a fairly close approximation to true $R$ if $\sigma_x$ is
close to $\sigma_y$. If $\sigma_x \ne \sigma_y$, it has been shown that this formula underestimates
true $R$ which is undesirable in some applications [1]. That is, if $R_1$ is used to
estimate, say the radius of a circle to include 50% of the errors ($p = .5$), it will
give a value which includes less than the desired proportion. The first table in
the last section gives a numerical verification of this fact.















To obtain formula (3) we consider formula (5) when $A$ is a circle of radius $R$.
We have

    $p = 4\int_0^R \int_0^{\sqrt{R^2 - x^2}} \varphi(x, y)\,dy\,dx$.

By making the transformation $x = \sigma_x r\cos\theta$, $y = \sigma_y r\sin\theta$, and by carrying out
the integration with respect to $r$ the above formula becomes

    $1 - p = (2/\pi)\int_0^{\pi/2} e^{-R^2/[(\sigma_x^2 + \sigma_y^2) - (\sigma_y^2 - \sigma_x^2)\cos 2\theta]}\,d\theta$.

We let

    $\alpha = R^2/(\sigma_x^2 + \sigma_y^2)$,    $\beta = (\sigma_y^2 - \sigma_x^2)/(\sigma_y^2 + \sigma_x^2)$,

and

    $\sigma_x/\sigma_y = \epsilon$;    $\sigma_x < \sigma_y$.

Then

    $\alpha = R^2/\sigma_y^2(1 + \epsilon^2)$, and $\beta = (1 - \epsilon^2)/(1 + \epsilon^2)$, which is less than unity.

This substitution will be helpful later in preparing tables. The fact that $\sigma_x$
is taken less than $\sigma_y$ places no limitation on the final results since we only have
to interchange axes in the other case. The above integral may now be written

(7)    $1 - p = (2/\pi)\int_0^{\pi/2} e^{-\alpha/(1 - \beta\cos 2\theta)}\,d\theta = (2/\pi)\,e^{-\alpha}\int_0^{\pi/2} e^{-\alpha\beta\cos 2\theta/(1 - \beta\cos 2\theta)}\,d\theta$.

The integrand, say $F(\theta)$, in the last integral of (7) can be shown to be monotone
increasing from $e^{-\alpha\beta/(1 - \beta)}$ to $e^{\alpha\beta/(1 + \beta)}$ as $\theta$ varies from 0 to $\pi/2$. Furthermore, it crosses
the line $F(\theta) = 1$ somewhere in this interval and differs but little from it any-
where if the ratio $\sigma_x/\sigma_y$ is close to 1, since $\beta$ is then close to zero. If, therefore,
we replace the integrand by $F(\theta) = 1$, we have $p = 1 - e^{-\alpha}$. Hence, if $\alpha$ is
replaced by $R^2/(\sigma_x^2 + \sigma_y^2)$ and the result solved for $R$, we have formula (3),

    $R_2 = \sqrt{(\sigma_x^2 + \sigma_y^2)\ln(1/[1 - p])}$.

Finally, formula (4),

    $R_3 = (\sigma_x + \sigma_y)\sqrt{(\tfrac12)\ln(1/[1 - p])}$,

is obtained by taking the root-mean-square of the former two. This formula
has certain advantages over the other two, the most obvious being that $\sigma_x$ and
$\sigma_y$ enter linearly so that it is simple to evaluate for given $\sigma_x$, $\sigma_y$, and $p$. Sec-
ondly it will be seen by the tables and additional comments made in the last
section that when $p = 0.5$,¹ $R_3$ overestimates true $R$ by a slight amount for all












¹ This particular value of $p$ gives the circular probable error. In this case $R_3 =
0.5887(\sigma_x + \sigma_y)$.







values of $\sigma_x/\sigma_y$, and it gives a fairly close approximation to true $R$ for all $p$
when $\sigma_x/\sigma_y \ge 0.5$.


We close this section by making a few brief comments. In the first place,
if any of the above formulas is to be computed from a sample of data, we take
$\sqrt{\Sigma x^2/(n - 1)}$ and $\sqrt{\Sigma y^2/(n - 1)}$ as estimates of $\sigma_x$ and $\sigma_y$ respectively. Fur-
thermore, we test the significance of these statistics by known formulas [2].
Finally, $\sigma_x$ and $\sigma_y$ may be replaced by $\sqrt{\pi/2}\,D_x$ and $\sqrt{\pi/2}\,D_y$, where $D_x$ is the
population mean deviation. Thus, for example,

    $R_3 = (D_x + D_y)\sqrt{(\pi/4)\ln(1/[1 - p])}$.
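
Formulas (2), (3) and (4), together with the mean-deviation form of $R_3$, translate directly into code. The Python sketch below is illustrative only; the function and argument names are ours.

```python
import math

def log_term(p):
    # ln(1/[1 - p]), the factor common to all three formulas.
    return math.log(1.0 / (1.0 - p))

def r1(sx, sy, p):
    # Formula (2): circle with the same area as the exact ellipse (6).
    return math.sqrt(2.0 * sx * sy * log_term(p))

def r2(sx, sy, p):
    # Formula (3): integrand F(theta) in (7) replaced by 1.
    return math.sqrt((sx * sx + sy * sy) * log_term(p))

def r3(sx, sy, p):
    # Formula (4): root-mean-square of R1 and R2.
    return (sx + sy) * math.sqrt(0.5 * log_term(p))

def r3_from_mean_deviation(dx, dy, p):
    # R3 with sigma replaced by sqrt(pi/2) times the mean deviation.
    return (dx + dy) * math.sqrt(math.pi / 4.0 * log_term(p))

sx, sy, p = 0.6, 1.0, 0.5
print(r1(sx, sy, p), r2(sx, sy, p), r3(sx, sy, p))
print(r3(1.0, 1.0, 0.5) / 2.0)   # coefficient of (sigma_x + sigma_y) at p = 0.5
```

Because $R_3$ is the root-mean-square of the other two, it always lies between $R_1$ and $R_2$, and the three coincide when $\sigma_x = \sigma_y$.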
3. Tables. The first formula in (7) is useful in testing by means of numerical
integration the goodness of approximation of the formulas $R_1$, $R_2$, and $R_3$ to
the true value of $R$. We construct the tables by replacing $R$ in $\alpha$ by one of these
formulas, say formula $R_1$. This gives $\alpha = [2\epsilon/(1 + \epsilon^2)]\ln[1/(1 - p)]$. Since
$\beta = (1 - \epsilon^2)/(1 + \epsilon^2)$, the right hand side of the formula in (7) may then be
evaluated for a choice of $\epsilon$ and $p$, giving $1 - p'$ for a value we denote by $p'$. This is the
actual proportion of errors that is included in the circle whose radius is $R_1$.
If $R_1$ gave true $R$, then $p'$ would be equal to $p$, so we may regard the difference
of $p$ and $p'$ as a measure of the error arising when $R_1$ is used to estimate $R$.


TABLE I
p′ computed by means of formula R_1

σ_x/σ_y   p = .1     .2     .25     .3     .4     .5     .6     .7     .75     .8     .9
   .5      .0988  .1951  .2425  .2893  .3815  .4720  .5615  .6508  .6960  .7422  .8408
   .6      .0994  .1974  .2459  .2942  .3899  .4846  .5786  .6726  .7198  .7676  .8668
   .7      .0997  .1987  .2480  .2972  .3950  .4924  .5894  .6864  .7350  .7838  .8835
   .8      .0999  .1995  .2492  .2989  .3981  .4970  .5958  .6946  .7440  .7936  .8935
   .9      .1000  .1999  .2498  .2997  .3996  .4993  .5991  .6988  .7483  .7986  .8985
  1.0      .1000  .2000  .2500  .3000  .4000  .5000  .6000  .7000  .7500  .8000  .9000

In the following tables the chosen values of $p$ and $\epsilon = \sigma_x/\sigma_y$ are listed in the
first row and column respectively. The remainder of the tables include the
corresponding values of $p'$.

We also have computed tables for $0.1 \le \sigma_x/\sigma_y \le 0.4$ which we have not in-
cluded in this paper since for this range of values of $\sigma_x/\sigma_y$, all of the formulas
give approximations that depart considerably from true $R$ except $R_3$ when $p =
0.5$. For this case, $p' = .4776$, $.5004$, $.5109$, and $.5120$ when $\sigma_x/\sigma_y = 0.1$, $0.2$,
$0.3$, and $0.4$ respectively.
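
The table construction described in this section can be reproduced with a few lines of numerical integration. The sketch below is a modern illustration, not the author's computing procedure: it applies a midpoint rule to the first integral in (7), and the function names and step count are our assumptions.

```python
import math

def log_term(p):
    return math.log(1.0 / (1.0 - p))

def r1(sx, sy, p):
    return math.sqrt(2.0 * sx * sy * log_term(p))          # formula (2)

def r2(sx, sy, p):
    return math.sqrt((sx * sx + sy * sy) * log_term(p))    # formula (3)

def r3(sx, sy, p):
    return (sx + sy) * math.sqrt(0.5 * log_term(p))        # formula (4)

def true_proportion(radius, sx, sy, steps=2000):
    # p' = 1 - (2/pi) * integral over (0, pi/2) of exp(-alpha/(1 - beta cos 2 theta)),
    # i.e. the first form of (7), evaluated with a midpoint rule.
    alpha = radius ** 2 / (sx ** 2 + sy ** 2)
    beta = (sy ** 2 - sx ** 2) / (sy ** 2 + sx ** 2)
    h = (math.pi / 2) / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        total += math.exp(-alpha / (1.0 - beta * math.cos(2.0 * theta)))
    return 1.0 - (2.0 / math.pi) * total * h

# One entry from each table (sigma_y = 1, so sigma_x equals the ratio epsilon).
print(round(true_proportion(r1(0.5, 1.0, 0.1), 0.5, 1.0), 4))
print(round(true_proportion(r2(0.5, 1.0, 0.5), 0.5, 1.0), 4))
print(round(true_proportion(r3(0.8, 1.0, 0.5), 0.8, 1.0), 4))
```

Only the ratio $\sigma_x/\sigma_y$ matters for $p'$, so fixing $\sigma_y = 1$ loses no generality.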




The difference between an entry in a column and the corresponding value
of $p$ at the head of the column reflects the error in estimating true $R$ by means of
$R_1$, $R_2$, and $R_3$. For example, if $p$ is chosen as .5 and $\sigma_x/\sigma_y = .8$ then $R_3$
gives the radius of a circle which includes 50.13% of the errors. Thus $R_3$
overestimates true $R$ by including .13% more of the errors.

By examining the tables it is seen that when $0.1 \le p \le 0.3$, $R_1$ gives the best
approximation to the true value of $R$, while $R_2$ gives the poorest. If $0.4 \le p \le
0.75$, $R_3$ gives the best and $R_1$ the poorest; and if $0.8 \le p \le 0.9$ $R_2$ gives the best
and $R_1$ the poorest. Thus formula $R_3$ for general use gives the best overall
approximation. It may be remarked at this point that bounds for the true
value of $R$ can be found by applying two of the formulas, one of which over-
estimates while the other underestimates $R$. From the tables it is apparent that
this can be done for values of $p \le 0.8$.


TABLE II
p′ computed by means of formula R_2

σ_x/σ_y   p = .1     .2     .25     .3     .4     .5     .6     .7     .75     .8     .9
   .5      .1215  .2363  .2912  .3446  .4467  .5432  .6346  .7217  .7641  .8060  .8907
   .6      .1116  .2202  .2732  .3255  .4274  .5261  .6218  .7146  .7600  .8050  .8949
   .7      .1057  .2100  .2616  .3127  .4140  .5136  .6116  .7081  .7558  .8032  .8976
   .8      .1022  .2039  .2546  .3051  .4056  .5055  .6048  .7034  .7525  .8014  .8991
   .9      .1005  .2009  .2509  .3012  .4013  .5012  .6011  .7008  .7506  .8003  .8999
  1.0      .1000  .2000  .2500  .3000  .4000  .5000  .6000  .7000  .7500  .8000  .9000


TABLE III
p′ computed by means of formula R_3

σ_x/σ_y   p = .1     .2     .25     .3     .4     .5     .6     .7     .75     .8     .9
   .5      .1102  .2161  .2674  .3176  .4152  .5092  .6001  .6887  .7327  .7768  .8694
   .6      .1056  .2089  .2597  .3100  .4090  .5059  .6009  .6944  .7408  .7872  .8817
   .7      .1027  .2044  .2548  .3050  .4046  .5031  .6007  .6974  .7456  .7937  .8908
   .8      .1011  .2017  .2519  .3020  .4018  .5013  .6003  .6991  .7483  .7976  .8963
   .9      .1003  .2004  .2504  .3004  .4004  .5003  .6001  .6998  .7496  .7995  .8992
  1.0      .1000  .2000  .2500  .3000  .4000  .5000  .6000  .7000  .7500  .8000  .9000

Finally, these formulas may be used to test roughly the normality of the data.
For example, if proper estimates² of $\sigma_x$ and $\sigma_y$ are made from the data, and the
corresponding value of $R_3$ computed for a chosen $p$, then approximately, the
proportion $p'$ of plotted errors should fall within the circle of radius $R_3$.


² See section 2.


REFERENCES 


[1] Henry Scheffé, Armor and Ordnance Report No. A-224, OSRD No. 1918, Div. 2,
pp. 60-61.

[2] S. S. Wilks, Mathematical Statistics, Princeton Univ. Press, 1943, p. 131.


A NOTE ON THE EFFICIENCY OF THE WALD SEQUENTIAL TEST 
By Edward Paulson
Institute of Statistics, University of North Carolina 


The sequential likelihood ratio test of Wald for testing the hypothesis $H_0$
that the probability density function is $f(X, \theta_0)$ against the one-sided alternative
$H_1$ that the function is $f(X, \theta_1)$ has been shown [1] to have the optimum property
of minimizing the expected number of observations at the two points $\theta = \theta_0$
and $\theta = \theta_1$. Tables showing the actual magnitude of the percentage saving
of this sequential procedure compared with the classical "best" non-sequential
test have been calculated (see [1], page 147) for the normal case when


    $f(X, \theta) =$


In this note we will show that when $\theta_1$ is close to $\theta_0$, the percentage saving is
independent of the particular function $f(X, \theta)$ and the particular values $\theta_0$
and $\theta_1$ so that the tables mentioned above can be used to show the percentage
saving for any one-sided sequential test involving a single parameter, provided
$f(X, \theta)$ satisfies some weak restrictions.

Let $f(X, \theta)$ be the probability density function of a random variable. Let
$E_i(n)$ denote the expected value (when $\theta = \theta_i$) of the number of independent
observations required by the Wald sequential procedure to test the hypothesis
$H_0$ that $\theta = \theta_0$ against $\theta = \theta_1 = \theta_0 + \Delta$ with probabilities $\alpha$ of rejecting $H_0$
when $\theta = \theta_0$ and $\beta$ of accepting $H_0$ when $\theta = \theta_1$. Let $N$ be the number of in-
dependent observations required to achieve the same probabilities $\alpha$ and $\beta$
for testing the hypothesis $\theta = \theta_0$ against $\theta = \theta_1$ by the most powerful non-
sequential test. Let $U_\alpha$ and $U_\beta$ be defined by the relations


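    $\frac{1}{\sqrt{2\pi}}\int_{U_\alpha}^{\infty} e^{-t^2/2}\,dt = \alpha$,    $\frac{1}{\sqrt{2\pi}}\int_{U_\beta}^{\infty} e^{-t^2/2}\,dt = \beta$.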





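We will prove the following theorem:

    $\lim_{\Delta = \theta_1 - \theta_0 \to 0} \frac{E_0(n)}{N} = \frac{-2\left\{\alpha\log\left(\frac{1 - \beta}{\alpha}\right) + (1 - \alpha)\log\left(\frac{\beta}{1 - \alpha}\right)\right\}}{(U_\alpha + U_\beta)^2}$

provided $f(X, \theta)$ satisfies the following conditions:

(A) $\int_{-\infty}^{\infty} f(X, \theta)\,dx$ can be differentiated twice under the integral sign with respect
to $\theta$.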
(B) All four of the integrals 


0 f" (a, 6*) _ ee 
Se a) ~ | Fa, 0) | f P% %) a 
“SG, 60) 
a) f(a, 0) 


oo f'(x 60) | 

: x, 6*) dz, 
[leas | x 

- "(x ) Oo) 

—— f(a, 90) 

are continuous functions of $\theta^*$ at $\theta^* = \theta_0$. A sufficient condition for (B) is that
all the integrals be uniformly convergent with respect to $\theta^*$ in some interval $\theta_0 \le
\theta^* \le \theta_0 + \Delta$, and all the integrands be continuous functions of $X$ and $\theta^*$. A
similar theorem holds regarding the limit of $E_1(n)/N$ as $\Delta \to 0$.


The proof is as follows: From [1], we know that 


a log (+—*) + (1 — a) log (; £ .) 
Ey (n) = ence erage ern + o(1), 


ae f(x, 6;) 
me log Ke 5, | 


f' (x, 6*) dz, 


f(x, 6*) dz, 


and o(1) ~O0asA— 0. 
Now 


nis ~ [Ton ft) Jw a 


. log f(x, 0 + A)] f(x, 0) dx — [. log f(x, 4) f(a, 4) dx. 















Expanding $\log f(x, \theta_0 + \Delta)$ in a Taylor series about $\Delta = 0$, we have








- f(z, 6) , Xf" - f) A’ 
ngs + 8) = loesteyad) +a Fos + [EE | + oR, 


where 





—<0* <H& +A, f'= d f(z, 6) jf’ = a’ f(z, 6) 


06” ogee” 


- fi a s* 6=6* 
= | r L.: 


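From assumption (A) we find that

    $\int_{-\infty}^{\infty} f'(x, \theta_0)\,dx = 0$ and $\int_{-\infty}^{\infty} f''(x, \theta_0)\,dx = 0$,

while from assumption (B)

    $\int_{-\infty}^{\infty} R_1 f(x, \theta_0)\,dx \to 0$ as $\Delta \to 0$,

so that

    $E_0\left[\log\frac{f(x, \theta_1)}{f(x, \theta_0)}\right] = -\frac{\Delta^2}{2}\left[\int_{-\infty}^{\infty} \frac{[f'(x, \theta_0)]^2}{f(x, \theta_0)}\,dx + o(1)\right]$.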


To find N for the most powerful non-sequential test, we make use of the fact 
(see [2]) that an asymptotically most powerful test for one-sided alternatives is 
given by a region of the type 


1 i=¥ f' (x; ce) 
Uy= > — > K. 
When A — 0, N — ~~, and since Uy is the sum of N independent variates with 


(28 E(Uy) 
.. 


a finite second moment, the distribution of Un 


Therefore 

















approaches that of a 
UN 

normal variate with zero mean and unit variance. Hence we find the N re- 

quired for a test with Type I and Type II errors a and 8 by solving for N from 

the relations 





K itt 
, Vel," 
"\F /o=0 
and 
-_ soe 
kK - \/N E, r) 
(2) (4 6=60 _ 





eng, & Ue 
vB). - [2G] 










Now let y = (5) , and we find from (1) and (2) that 
O69) 


N = = VEWy) + a ve) - aay. 
uy 


“T@, 60) 
Low f(x, 60) 


"SG, PY) ii. 7 j "TG, 90) 
A c Ha, 6s) f’ (x, 0) dx +A [. F(a, 6) 


AEvy*[1 + 0(1)] from assumption B. 
Proceeding in a similar manner, we find 
[Ua V Ey)? + Us V Ey)? — (Exel = Eoly*)[Ua + Up(l + o(1))P. 


We now have 


1-8 B 
Eun) _ __ATEo(y)A + 00)? © - ( a )+ - ( - :) 
N E,(y*)[Ua + U,(1 + o(1))}? -5 [Eo(y’) rs o(1)] 


f(x, 6) dx 


[f'(a, O)]e=s, ax 








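therefore

    $\lim_{\Delta \to 0} \frac{E_0(n)}{N} = \frac{-2\left\{\alpha\log\left(\frac{1 - \beta}{\alpha}\right) + (1 - \alpha)\log\left(\frac{\beta}{1 - \alpha}\right)\right\}}{(U_\alpha + U_\beta)^2}$.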
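
The limiting ratio depends only on $\alpha$ and $\beta$ and is easily evaluated. The Python sketch below is an illustration under our own naming; it assumes that $U_\alpha$ and $U_\beta$ are the upper $\alpha$ and $\beta$ points of the standard normal distribution, which is the reading used in relations (1) and (2) of the proof.

```python
import math
from statistics import NormalDist

def limiting_ratio(alpha, beta):
    # Limit of E_0(n)/N as Delta -> 0, from the theorem above.
    u_alpha = NormalDist().inv_cdf(1.0 - alpha)   # upper alpha point of N(0,1)
    u_beta = NormalDist().inv_cdf(1.0 - beta)     # upper beta point of N(0,1)
    numerator = (alpha * math.log((1.0 - beta) / alpha)
                 + (1.0 - alpha) * math.log(beta / (1.0 - alpha)))
    return -2.0 * numerator / (u_alpha + u_beta) ** 2

for a, b in ((0.01, 0.01), (0.05, 0.05), (0.10, 0.10)):
    ratio = limiting_ratio(a, b)
    print(a, b, round(ratio, 4), "saving about", round(100 * (1 - ratio), 1), "%")
```

A ratio below unity means that, near $\theta_0$, the sequential test needs on the average fewer observations under $H_0$ than the fixed-sample test with the same $\alpha$ and $\beta$.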
REFERENCES 
[1] A. Wald, "Sequential tests of statistical hypotheses", Annals of Math. Stat., Vol.
16 (1945).

[2] A. Wald, "Some examples of asymptotically most powerful tests", Annals of Math.
Stat., Vol. 12 (1941).




A NOTE ON THE POISSON-CHARLIER¹
FUNCTIONS
By C. TRUESDELL 
Naval Ordnance Laboratory 


The polynomials $p_n(m, z)$ given by the definition


(1) pe ip er Lr, 


dz” 


¹ This note was written while the author was employed by the Radiation Laboratory,
M.I.T.













called the Poisson-Charlier polynomials, and the associated function $\psi_n(m, z)$
given by the definition

(2)    $\psi_n(m, z) = p_n(m, z)\psi_0(m, z)$,

(3)    $\psi_0(m, z) = \frac{e^{-z} z^m}{m!}$,

occur in statistics. Doetsch [1] has devoted a memoir to them, and they are
noticed in Szegö's Orthogonal Polynomials (pp. 33-34).

I suggest that they are most directly and easily studied in connection with 
the ‘‘F-equation”’ 


(4)    $\frac{\partial}{\partial z} F(z, \alpha) = F(z, \alpha + 1)$,


whose properties and application to various special functions I have sum- 
marized in a recent note [2]. Using the theorems of that note, which I shall 
cite by number, I shall now generalize the Poisson-Charlier polynomials and 
sketch the speediest derivation of their most interesting formal properties. 

Greek letters shall represent unrestricted real numbers, while Latin letters 
shall represent integers. 

From the existence theorem for the F-equation (Theorem 4) we know that 
there exists an integral function of z, Fs(z, a), which satisfies the F-equation 
and the condition 













(5) ab a anil 4 ax( > 


From the uniqueness theorem for the F-equation (Theorem 4) it follows that 
(6) F,(z,n — B + 3) = 0, 
(7) F,(z, n) = 0, n> 0. 


From the general power series solution for the F-equation (Theorem 4) we have 
the formula 






















8) Fle, a) = 008 (a+ Ae (* )aii(as8 + a + 152). 


We now define the Poisson-Charlier functions in general by the formulas 


(9) Da(a, z) = T'(a@ + 1)z “F(z, —a), 





_ a 2" 
(10) va(a, z) = Tia + 1) pa(a, z). 
From the formulas (6) and (7) we see that [1, p. 263] 
(11) 





va(—n, z) = 0, n> 0; 




(12) v(—n+B—3,2)=0, p(—n+8 — 3,2) =0. 


From the formula (8) we see that 


(13) pa(a, 2) = cos (8 — a)r aot 2 Fi(—a;B —at 1; 2), 


whence it follows at once that 


(14) pa(m, z) = cos Br = (7) (?) k\(—z)~*. 
kenO L 


This is the usual explicit expression for the Charlier polynomials [1, p. 257]. 
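
For integral order the explicit expression can be checked against the F-equation by exact arithmetic. The sketch below rests on two assumptions of ours, since the displays are not fully legible in this copy: that for integral $n$ the expression (14) reads $p_n(m, z) = (-1)^n \sum_k \binom{n}{k}\binom{m}{k} k!\,(-z)^{-k}$, and that $\psi_{n+1} = \partial\psi_n/\partial z$ with $\psi_n = p_n\psi_0$ and $\psi_0 = e^{-z}z^m/m!$, which gives the recurrence $p_{n+1} = p_n' + (m/z - 1)p_n$. Polynomials are stored as coefficient lists in powers of $1/z$; all names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial

def charlier_explicit(n, m):
    # Coefficient of z**(-k), k = 0..n, in the assumed explicit expression.
    return [Fraction((-1) ** (n - k) * comb(n, k) * comb(m, k) * factorial(k))
            for k in range(n + 1)]

def charlier_by_recurrence(n, m):
    # Build p_n from p_0 = 1 using p_{n+1} = p_n' + (m/z - 1) p_n.
    coeffs = [Fraction(1)]
    for _ in range(n):
        nxt = [Fraction(0)] * (len(coeffs) + 1)
        for k, a in enumerate(coeffs):
            nxt[k] -= a                  # the term -p_n
            nxt[k + 1] += (m - k) * a    # (m/z) p_n plus d/dz of a * z**(-k)
        coeffs = nxt
    return coeffs

for n in range(4):
    print(n, charlier_explicit(n, 3), charlier_by_recurrence(n, 3))
```

Agreement of the two lists for a range of $n$ and $m$ is a consistency check on the assumed reading, not a proof of it.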
From formula (13) we see that 


(15) sillicnae db en = ra — a)s"(e, 2). 


In the indeterminate case when a is a negative integer we see from the formula 
(14) that 


(16) $p_0(m, z) = 1$, $m \ge 0$.
Hence 


sin2amr -, 


(17) Yo(—a, 2) = — é “y(a, 2), 


(18) $\psi_0(m, z) = \dfrac{e^{-z} z^{m}}{m!}$.

From the definition (10) we now see that

(19) $\psi_\beta(m, z) = p_\beta(m, z)\,\psi_0(m, z)$;


a generalization of the formula (2). From the formula (13) and the definition
(10) we see that

(20) $\psi_\alpha(\beta, z) = \cos(\beta - \alpha)\pi \, \dfrac{\Gamma(\alpha + 1)}{\Gamma(\beta + 1)\Gamma(\alpha - \beta + 1)} \, e^{-z} \, {}_1F_1(-\beta; \alpha - \beta + 1; z)$.

Then by Kummer's first transformation,

(21) $\psi_\alpha(\beta, z) = \cos(\beta - \alpha)\pi \, \dfrac{\Gamma(\alpha + 1)}{\Gamma(\beta + 1)\Gamma(\alpha - \beta + 1)} \, {}_1F_1(\alpha + 1; \alpha - \beta + 1; -z)$,

from which it follows from the power series formula for solutions of the F-equation
(Theorem 4) that $\psi_\alpha(\beta, z)$ is a solution of the F-equation (4).

We now have two different solutions of the F-equation based on the Poisson-
Charlier functions:

(A) $F(z, \alpha) = e^{z}\psi_\beta(-\alpha, z)$,

(B) $F(z, \alpha) = \psi_\alpha(\beta, z)$.
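The Kummer step and the F-equation property of solution (B) can be checked numerically. The sketch below assumes the two hypergeometric forms $\psi_\alpha(\beta, z) = C\, e^{-z}\, {}_1F_1(-\beta; \alpha-\beta+1; z)$ and $\psi_\alpha(\beta, z) = C\, {}_1F_1(\alpha+1; \alpha-\beta+1; -z)$ with $C = \cos(\beta-\alpha)\pi\, \Gamma(\alpha+1)/[\Gamma(\beta+1)\Gamma(\alpha-\beta+1)]$. The helper `hyp1f1` is a hand-written truncated series, and all function names are illustrative; the parameters must keep $\alpha - \beta + 1$ away from zero and the negative integers.

```python
from math import cos, exp, gamma, pi


def hyp1f1(a: float, b: float, z: float, terms: int = 200) -> float:
    """Truncated power series of 1F1(a; b; z); adequate for moderate |z|."""
    total, term = 1.0, 1.0
    for k in range(terms):
        term *= (a + k) / (b + k) * z / (k + 1)
        total += term
    return total


def coeff(alpha: float, beta: float) -> float:
    """Common factor cos((beta-alpha)pi) Gamma(alpha+1) / (Gamma(beta+1) Gamma(alpha-beta+1))."""
    return (cos((beta - alpha) * pi) * gamma(alpha + 1.0)
            / (gamma(beta + 1.0) * gamma(alpha - beta + 1.0)))


def psi_before_kummer(alpha: float, beta: float, z: float) -> float:
    """Form with e^{-z} 1F1(-beta; alpha-beta+1; z)."""
    return coeff(alpha, beta) * exp(-z) * hyp1f1(-beta, alpha - beta + 1.0, z)


def psi_after_kummer(alpha: float, beta: float, z: float) -> float:
    """Form with 1F1(alpha+1; alpha-beta+1; -z)."""
    return coeff(alpha, beta) * hyp1f1(alpha + 1.0, alpha - beta + 1.0, -z)


def d_dz(f, z: float, h: float = 1e-5) -> float:
    """Central finite difference in z, used only as a numerical check."""
    return (f(z + h) - f(z - h)) / (2.0 * h)


if __name__ == "__main__":
    a, b, z = 0.3, 1.7, 0.8
    # The two forms should agree (Kummer's first transformation).
    print(psi_before_kummer(a, b, z), psi_after_kummer(a, b, z))
    # d/dz F(z, alpha) should agree with F(z, alpha + 1).
    print(d_dz(lambda t: psi_after_kummer(a, b, t), z), psi_after_kummer(a + 1.0, b, z))
```

The sample values of $\alpha$, $\beta$, and $z$ are arbitrary non-integer choices made for the check.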


From the F-equation it is evident that

(22) $\psi_n(\beta, z) = \dfrac{\partial^{n}}{\partial z^{n}} \psi_0(\beta, z)$,

whence we at once deduce the formula (1). Applying Taylor's theorem for the
F-equation (Theorem 8) to the solution (B) we see that [1, p. 259]





(23) $\psi_\alpha(\beta, z + h) = \sum_{n=0}^{\infty} \dfrac{h^{n}}{n!} \, \psi_{\alpha+n}(\beta, z)$;

putting a equal to zero we find that 

(24) oa ~~ e*"y(—8, c + h) = _ _ Yn(B, Z), 
2r nao 1! 


and, more specially [1, p. 260]

(25) $(1 + s)^{m} e^{-zs} = \sum_{n=0}^{\infty} \dfrac{(zs)^{n}}{n!} \, p_n(m, z)$.


Applying the same theorem to the solution (A) we obtain the formula

(26) $e^{h} \psi_\beta(\alpha, z + h) = \sum_{n=0}^{\infty} \dfrac{h^{n}}{n!} \, \psi_\beta(\alpha - n, z)$,


whence we recover the formula (11) by putting a equal to zero. 
Applying Theorem 9 to the solution (B) yields the result 


(27) ¥ fvern(8, 2) = [ & Wa(B, 2 + Ot) dB, 


N=0 


which contains as a special case the formula 


(28) > t*pr(m,z) = (l+e)7"™ (‘)° or E ~ v(m +1, e(1 +1))} 


Appell's generating expansion (see Theorem 10, part C or [3, p. 120]) applied
to the solution (A) yields the result





(29) Xu va(n,z + yt” =e yy va(n, y)t"; 
hence 

= ( _ anit) F yt y pa(n, y) 
(30) 2 n! mln, 2 + ¥) =e 2 (. +y ni * 


Putting y equal to zero and using the formula (13) we see that 





(31) $\sum_{n=0}^{\infty} \dfrac{t^{n}}{n!} \, p_\beta(n, z) = e^{t} \left(1 - \dfrac{t}{z}\right)^{\beta} \cos \beta\pi$.


Comparing this result with the formula (25) we see that 
(32) $(-)^{n} p_n(m, z) = (-)^{m} p_m(n, z)$.


It would be possible to proceed in this same fashion and discover many other 
formal properties of the Poisson-Charlier functions, but it is perhaps easier to 
notice from the formula (13) that 


(33) $p_\beta(\alpha, z) = \cos(\beta - \alpha)\pi \, \Gamma(\alpha + 1) \, z^{-\alpha} L_\alpha^{(\beta - \alpha)}(z)$,

$L_\alpha^{(\gamma)}(x)$ being Laguerre's function suitably generalized for complex lower index
[4, p. 53]. By means of this formula every relationship involving Laguerre
functions may be translated into one involving Poisson-Charlier functions.


REFERENCES 


[1] G. DOETSCH, "Die in der Statistik seltener Ereignisse auftretenden Charlierschen
Polynome und eine damit zusammenhängende Differentialdifferenzengleichung",
Math. Ann., Vol. 109 (1934), pp. 257-266.

[2] C. TRUESDELL, "On the functional equation $\frac{\partial}{\partial z} F(z, \alpha) = F(z, \alpha + 1)$", Proc. Nat.
Acad. Sci. U.S.A., April, 1947.

[3] P. APPELL, "Sur une classe de polynômes", Ann. École Norm., Vol. 9 (1880), pp.
119-144.

[4] E. PINNEY, "Laguerre functions in the mathematical foundations of the electro-
magnetic theory of the paraboloidal reflector", Jour. Math. Phys. M.I.T.,
Vol. 25 (1946), pp. 49-79.





ABSTRACTS OF PAPERS 


Presented June 17-19, 1947, at the San Diego meeting of the Institute 


1. Random Variables with Comparable Peakedness. Z. W. BIRNBAUM,
University of Washington.


Let U and V be random variables with symmetrical distributions, i.e. with $P(U \le -T) = P(U \ge T)$
and $P(V \le -T) = P(V \ge T)$ for all $T \ge 0$. The random variable U shall be
called more peaked than V if $P(|U| \ge T) \le P(|V| \ge T)$ for all $T \ge 0$. Let $X_1, Y_1$ and
$X_2, Y_2$ be two pairs of independent random variables such that $X_i$ is more peaked than $Y_i$
for i = 1, 2. Then under certain additional conditions $X = X_1 + X_2$ is more peaked than
$Y = Y_1 + Y_2$.


2. On Optimum Tests of Composite Hypotheses with One Constraint. ERICH
L. LEHMANN, University of California, Berkeley.


The problem studied is that of finding all similar and bisimilar test regions of composite 
hypotheses, and of obtaining the most powerful of these regions. Various results are ob- 
tained for distributions which admit sufficient statistics with respect to their parameters. 
Applications are made to the hypothesis specifying the value of the circular correlation 
coefficient in a normal population, and certain hypotheses concerning scale and location 
parameters in exponential and rectangular populations. 


3. Estimation of a Distribution Function by Confidence Limits. FRANK J. 
Massey, Jr., University of California, Berkeley. 


Let $x_1, x_2, \cdots, x_n$ be the results of n independent observations, having the same cumulative
distribution function F(x). Form the function $S_n(x) = k/n$ where k is the number of
observations less than or equal to x. A confidence band $S_n(x) \pm \lambda/\sqrt{n}$ will be used to
estimate F(x). To determine the confidence coefficient it is necessary to find
$\Pr\{\max_x \sqrt{n}\,|S_n(x) - F(x)| \le \lambda\}$. It is sufficient to consider x uniformly distributed in the interval
(0, 1). Let $\lambda/\sqrt{n} = s/t$ where s and t are integers. Then $S_n(x)$, to stay in the band $F(x) \pm \lambda/\sqrt{n}$,
can only pass through certain lattice points above $x = i/tn$, $i = 1, 2, \cdots, tn$. The
probability of $S_n(x)$ passing through a particular sequence of these points is given by the
multinomial law, and this can be summed over all permissible sequences. Limiting distributions
have been given by A. Kolmogoroff, and by N. Smirnoff. It is desired to test
the hypothesis $F(x) = F_0(x)$ against alternatives $F(x) = F_1(x)$. Using the criterion: reject
$F_0(x)$ if

$\max_x \sqrt{n}\,|F_0(x) - S_n(x)| > \lambda$

the probability of first kind of error can be controlled by choice of $\lambda$. A lower bound to the
probability of second kind of error against alternatives such that $\max_x \sqrt{n}\,|F_0(x) - F_1(x)| = \Delta$
is given. This lower bound approaches one as $n \to \infty$. Thus the test is consistent.
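The criterion in this abstract can be sketched in a few lines. The code below computes $\max_x \sqrt{n}\,|F_0(x) - S_n(x)|$ for a continuous hypothesized distribution and, as an assumption not spelled out in the abstract, uses the series form of Kolmogoroff's limiting distribution, $1 - 2\sum_{k \ge 1} (-1)^{k-1} e^{-2k^2\lambda^2}$, to attach an approximate confidence coefficient to a chosen $\lambda$. Function names are illustrative.

```python
import math


def ks_statistic(sample, cdf):
    """Return max_x sqrt(n) * |F0(x) - S_n(x)| for a continuous c.d.f. F0."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # S_n jumps from i/n to (i+1)/n at x, so both sides of the step are checked.
        d = max(d, (i + 1) / n - f, f - i / n)
    return math.sqrt(n) * d


def kolmogorov_cdf(lam, terms=100):
    """Limiting probability that the statistic is <= lam (series form).

    The truncated series is adequate for lam of practical size (roughly lam > 0.3).
    """
    if lam <= 0.0:
        return 0.0
    tail = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
               for k in range(1, terms + 1))
    return 1.0 - 2.0 * tail


def reject(sample, cdf, lam):
    """Reject F0 when the statistic exceeds lam."""
    return ks_statistic(sample, cdf) > lam


if __name__ == "__main__":
    data = [0.7, 0.1, 0.4]          # toy sample, tested against the uniform law on (0, 1)
    print(ks_statistic(data, lambda x: x))
    print(kolmogorov_cdf(1.36))     # close to 0.95 in the limit
```

For small n the limiting value is only an approximation; the exact lattice-path computation described in the abstract is not attempted here.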


4. A Note on Sequential Confidence Sets. CHARLES STEIN, Columbia University.

This paper generalizes a paper of Stein and Wald, appearing in the Annals of Math. Stat.,
Sept., 1947.

Let $\{X_i\}$, $(i = 1, 2, \cdots)$, be a sequence of random variables whose distribution depends
on an unknown parameter $\theta$. Sequential confidence sets are determined by a rule indicating
when to stop sampling and a rule giving the confidence set as a function of the sample. It
is desired that, for each sample point, the confidence set should be one of a specified class S,
that the probability of covering the true parameter should be $\ge \alpha$, and that the least upper
bound of the expected number of observations should be minimized. If $X_i$ are independent
with the rectangular distribution on $(0, \theta)$ and S consists of all intervals of the
form $(\theta_0, k\theta_0)$ with k fixed and $\theta_0$ a function of the sample, the optimum sequential procedure
is the classical non-sequential procedure. If the $X_i$ are independently and identically
distributed in accordance with a multivariate normal distribution with known covariance
matrix $\Sigma$ but unknown mean $\theta$, and the confidence sets are to be of the form
$(\theta - \theta_0)' \Sigma^{-1} (\theta - \theta_0) \le r$, r fixed, $\theta_0$ a variable p-dimensional vector, a similar result holds, provided
the desired confidence coefficient $\alpha$ is not excessively small.


5. Explicit Solution of the Problem of Fitting a Straight Line when Both
Variables are Subject to Error for the Case of Unequal Weights. ELIZABETH
L. SCOTT, University of California, Berkeley.


Let $\alpha$, $\beta$ and $\xi_i$, $(i = 1, 2, \cdots, s)$, be unknown fixed numbers and let $\eta_i = \alpha + \beta\xi_i$. For
each value of i there exist $m_i$ measurements $x_{ij}$ of $\xi_i$ and $n_i$ measurements $y_{ik}$ of $\eta_i$,
$(j = 1, 2, \cdots, m_i;\ k = 1, 2, \cdots, n_i)$. The variables $x_{ij}$ and $y_{ik}$ are normally distributed about
$\xi_i$ and $\eta_i$ with variances $\sigma_1^2/u_i$ and $\sigma_2^2/v_i$ respectively, where the weights $u_i$ and $v_i$ are known
but $\sigma_1$ and $\sigma_2$ are unknown. The numbers $m_i$ and $n_i$ are bounded (usually small) while s
increases indefinitely. Thus $\alpha$, $\beta$, $\sigma_1$ and $\sigma_2$ appear as structural parameters and the $\xi_i$ as
incidental parameters. (See paper by J. Neyman and E. L. Scott to appear in Econometrica.)
Modified maximum likelihood equations (MMLE) yielding consistent estimates of the
structural parameters are tedious to solve when the products $m_i u_i$ and $n_i v_i$ depend on i.
The main result of this paper consists in proving that the varying $m_i u_i$ and/or $n_i v_i$ can be
treated as constants. Let $w_1$ and $w_2$ be the harmonic means of $m_i u_i$ and $n_i v_i$, respectively.
Now, MMLE's written with $m_i u_i = w_1$ and $n_i v_i = w_2$ yield consistent estimates of $\alpha$ and $\beta$.
The asymptotic variances are also found. An application is made to certain problems of 
astronomy. 


6. Unbiased Estimates with Minimum Variance. CHARLES STEIN, Columbia 
University. 

Let X be a random variable distributed in the space R according to one of the p.d.f.'s
$\varphi(x \mid \theta)$, where $\theta$ is an unknown parameter, and let $g(\theta)$ be a real-valued function of $\theta$.
Let $B(\theta)$ be the set of all x such that $\varphi(x \mid \theta) > 0$ but $\varphi(x \mid \theta_0) = 0$, and S the set of all $\theta$
such that $B(\theta)$ has probability 0 when $\theta$ is the true parameter value. Let

$\psi(x \mid \theta) = \varphi(x \mid \theta)/\varphi(x \mid \theta_0)$ and $A(\theta_1, \theta_2) = E\{\psi(X \mid \theta_1)\,\psi(X \mid \theta_2) \mid \theta_0\}$

for $\theta_1, \theta_2$ in S. Suppose $A(\theta_1, \theta_2)$ is everywhere finite and there exists a set function $\lambda$
of bounded variation over S such that $\int_S A(\theta_1, \theta_2)\, d\lambda(\theta_1) = g(\theta_2)$. Then an estimate of
$g(\theta)$, unbiased for all $\theta$ in S and having minimum variance at $\theta_0$, is given by
$f(x) = \int_S \varphi(x \mid \theta)\, d\lambda(\theta) / \varphi(x \mid \theta_0)$. The minimum variance is $\int_S g(\theta)\, d\lambda(\theta) - [g(\theta_0)]^2$. If the
definition of f(x) is modified at a set having probability 0 when $\theta = \theta_0$, the properties on S
and at $\theta_0$ remain unchanged. Under mild restrictions this alteration can be carried out so
as to make f(x) an unbiased estimate of $g(\theta)$ for all $\theta$. The results are related to the work of
Fisher, Dugué, Rao, and Bhattacharyya on the amount of information. 


7. Sufficient Statistics and a System of Partial Differential Equations. (A
Contribution to the Neyman-Pearson Theory of Testing Hypotheses.) Preliminary
Report. ERICH L. LEHMANN, University of California, Berkeley, AND
HENRY SCHEFFÉ, University of California, Los Angeles.


In the Neyman-Pearson theory of testing hypotheses the problem of the existence and 
determination of similar regions has been treated under two approaches: (1) Assuming the 
existence of a set of sufficient statistics for the nuisance parameters; (2) Assuming that the 
probability density satisfies a certain system of partial differential equations. By solving 
the differential equations it is now shown that they imply the existence of sufficient statistics 
for the nuisance parameters. Knowledge of the form of the solution of the differential 
equations permits simplification of the known theory of optimum tests (type B, $B_1$, etc.)
as well as some generalization. 


8. Power Function of the Analysis of Variance and Covariance Test of a
Normal Bivariate Population. W. M. CHEN, University of California, Berkeley.


The problem of finding the power function of the analysis of variance and covariance test
of a normal bivariate population, $\rho = 0$ and $\sigma_1 = \sigma_2$, by means of the principle of likelihood
was reduced to the determination of the distribution function P(L) of the following moment 
problem: 


1 — g)(n-1)/2 © oF iit 
[ rMh« 3-52 yor (*S +r) any, (k = 1,2, +++), 


0 r n—1 aon wt 
2 


4*ET(n sw SE (ed ba Ce + r) 


where 


2 


r(*3*)r(2434 +r)ra- 1+2k +7) 


and a, the argument of the power function, lies in the interval (0, 1) and vanishes only when 
the hypothesis tested is true. The moment problem was found and solved by rather tricky 
methods. The result is 


(1 — b)&-Di2 2 oe /n—1 a9 
Pip ogee Fe + 0 nS e+1) 


r n—1 & s! 
2 
2 
where b= ( = y. 
l-—a 


9. A Mathematical Model of the Relation between White and Yolk Weights 
of Birds’ Eggs. G. A. Baker, University of California, Davis. 


The purpose of such a model is to find a rational method of estimating a “best line” in 
some sense which will represent the relation between white and yolk weights for some or all 
species of birds. From data at hand it appears that birds within species may differ in 
means and variances of weights and that the yolk and white weights are positively corre- 
lated. Yolk and white weights within a species are functions of egg number. The standard 
deviations of yolk and white weights for different species are approximately proportional 
to mean values. The “true” means for yolk and white weights for different species do 
not lie on a line because of biological differences between species with the same egg size. 
The standard deviation of species deviations from a straight line depends on the size of the
egg (may be proportional to a weighted sum of the yolk and white weights). If sampling


Mir = 
















variances are sufficiently small they may be neglected and a straight line fitted assuming 
both variables subject to error and non-uniform variance. The practicality of maximum 
likelihood estimates is considered. 


10. Statistical Analysis for a New Procedure in Sensitivity Experiments. A.
M. MOOD, Iowa State College, AND W. J. DIXON, University of Oregon.


In the language of biological assay the sensitivity experiment investigates the proportion
of subjects that respond to a given concentration, x, of a certain chemical. It is assumed
that only one test may be made on each subject. The new procedure is characterized by a
change in x for each successive test, depending on the result of the preceding test. x is
reduced to the next lower of a fixed set of concentrations for the next test if there is no
response and is increased to the next higher concentration if there is a response. Observations
are thus concentrated near the mean and few tests are made for values of x where a
very large or very small proportion of subjects would respond. Assuming x is normally
distributed, approximate maximum likelihood estimates are obtained for the mean and
standard deviation of x. These assume a form which is simple to compute. Choice of optimum
increments of x for various situations is investigated.


11. The Relation of Inbreeding to Calf Mortality. W. M. REGAN, S. W.
MEAD, AND P. W. GREGORY, University of California, Davis.


An analysis of calf mortality in the University of California dairy cattle breeding experi- 
ment is presented. Calves up to 4 months of age that were born singly are included in the 
study. Only those stillbirths and abortions from cows free from Brucellosis and health 
and reproductive abnormalities were considered. A total of 774 Jersey and 258 Holstein 
calves were included. Calves were classified according to inbreeding coefficients as follows: 
Class I, the controls 0.0 to 0.1249; Class II, 0.125 to 0.2448; Class III, 0.245 to 0.3749; and 
Class IV, 0.375 and over. There was no relation between the number of abortions and the 
degree of inbreeding. The stillbirths, too few to be statistically significant, tend to increase 
as the coefficient of inbreeding increased. Following birth, however, mortality was corre- 
lated with inbreeding of both males and females but for the males it was greater than for 
the females in Classes III and IV, but the difference is hardly significant. The Jerseys 
tended to be less viable than the Holsteins. Some of the increased mortality of the more 
highly inbred animals could be accounted for by the action of two lethal genes; one con- 
trolling an anomaly of the liver, the other an anomaly of the heart; there was no plausible 
explanation for most of it. Within sex, inbreeding class, and breed there was considerable 
variation in the mortality of the progeny of different sires. Some of these differences were 
statistically significant. 


12. Observations on Designs for Cooperative Field Tests. P. A. MINGES,
University of California, Davis. 


In California conditions vary so greatly between the principal production areas that it 
is necessary to establish experimental plots in each of the areas if reliable information is 
to be obtained regarding cultural practices. Most of these tests must be conducted on 
ranches in cooperation with growers and local agricultural extension agents. The designs 
of these tests should be relatively simple, the arrangement should be adjustable to work 
into the growers’ cultural practices and to permit the obtaining of yield records with a 
minimum of interference to the growers’ operations, yet the design must be adequate to 
yield valid data. The randomized block design has proved the most useful, although paired 
plots, factorials, split-plots and Latin squares have been used successfully under certain 
conditions. The Latin square design is useful when a two-way variation is expected, otherwise
it is not usually very efficient. Where yield data are of prime importance, four replications
have been considered most practical. In tests such as variety trials when factors
other than yields are important, two replications may be adequate. The size of the plot 
has been varied to fit the crop, conditions of the field, and known soil variability. Plots 
two rows wide and 50 to 135 feet long often have been used, frequently without guards 
between plots. Since it is desirable to include checks (untreated controls) in most tests, 
small plots will reduce the loss to the growers when the treatments prove beneficial. The 
information derived from these tests is of most interest to growers and county agents so 
the data should be presented in tables that are easily read. The variability figure which is 
confusing to most people probably can best be presented as the least significant difference. 


13. Population Genetics. N. H. HOROWITZ, California Institute of Technology.


Population genetics attempts to describe the effects on the genetical structure of Mende- 
lian populations of factors such as mutation, selection, migration, and random fluctuations 
due to sampling errors. These diverse elements are brought under a common viewpoint by 
considering their effects on gene frequencies. Since change in gene frequency is the ele- 
mentary process of evolution, the above factors are causal agents of evolution. Mathe- 
matical models illustrating the interplay of the various elements have been constructed by 
Wright, Haldane, and Fisher. The nature of Mendelian inheritance is such that gene 
frequencies remain constant in large populations not subject to net mutation, selection, or 
migration pressures. Unbalanced pressures initiate evolutionary changes which continue 
until equilibrium is reached at a new level of gene frequencies. Equilibrium frequencies are 
determined by opposing pressures—e.g., opposing mutation rates, mutation opposed by 
selection, etc. Equilibrium, stable or unstable, is also possible under selection alone. In 
small populations, sampling errors among the gametes produce random fluctuations in gene 
frequencies which, superimposed on the equilibrium values, result in probable distributions 
of frequencies. The latter provide a mechanism for the evolution of characters, especially 
biochemical syntheses, which depend on the simultaneous action of a number of individually 
non-adaptive genes. 
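The simplest of the equilibria mentioned in this abstract, that between opposing mutation rates, can be illustrated numerically. The sketch assumes the textbook two-allele recurrence $q' = q(1 - v) + (1 - q)u$, where u is the mutation rate toward the allele with frequency q and v the rate away from it; this model and the names below are assumptions made for illustration, not details given in the abstract.

```python
def next_frequency(q: float, u: float, v: float) -> float:
    """One generation of two-way mutation: gain (1 - q) * u, loss q * v."""
    return q * (1.0 - v) + (1.0 - q) * u


def equilibrium(u: float, v: float) -> float:
    """Frequency at which the two mutation pressures balance."""
    return u / (u + v)


def iterate(q0: float, u: float, v: float, generations: int) -> float:
    """Apply the recurrence repeatedly, starting from frequency q0."""
    q = q0
    for _ in range(generations):
        q = next_frequency(q, u, v)
    return q


if __name__ == "__main__":
    # Mutation rates are exaggerated so that convergence is visible quickly.
    u, v = 0.02, 0.01
    print(iterate(0.1, u, v, 1000), equilibrium(u, v))
```

The deviation from equilibrium shrinks by the factor $(1 - u - v)$ each generation, so with realistic mutation rates the approach is extremely slow.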


14. The Choice of Inspection Stringency in Acceptance Sampling by Attributes. 
J. L. HODGES, JR., University of California, Berkeley.


In acceptance sampling by attributes, the probability p that an item will be defective is
taken to be a function $g(x, y)$ of the quality x of the population and the stringency y of
inspection. Let n, the number of items inspected, be fixed, and reject if the number of
defectives is $\ge k$. It may then be possible to satisfy a condition on the power function
with different values of k, by adjusting y properly. This paper is concerned with the choice
of k and y in such situations. A criterion is given, and it is shown that the criterion is
approximately satisfied by $k = [n\,g(x_0, \bar{y})]$ where $x_0$ separates acceptable and non-acceptable
values of x, and $\bar{y}$ maximizes

$\dfrac{\partial g(x_0, y)}{\partial x} \Big/ \sqrt{g(x_0, y)\,[1 - g(x_0, y)]}$.

An asymptotic property of this approximation is shown. The method is applied to two
examples: (a) testing the mean bacterial density x of a liquid by the dilution method, y
being the volume of liquid incubated, and (b) testing the variance x of a normally distributed
dimension of known mean m by applying gauges set at $m \pm 1/y$. The approximate


solution is found to be satisfactory in both cases for m = 20. 
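Example (a) can be sketched under two stated assumptions. First, the quantity to be maximized is read as $\frac{\partial g(x_0, y)}{\partial x} / \sqrt{g(x_0, y)[1 - g(x_0, y)]}$. Second, the dilution method is modelled by $g(x, y) = 1 - e^{-xy}$, the probability that a volume y contains at least one organism when counts are Poisson with density x; the abstract does not give this form. The square brackets in $k = [n\,g]$ are taken to mean the integer part. All names are illustrative.

```python
import math


def g(x: float, y: float) -> float:
    """Assumed dilution model: probability that volume y shows growth at density x."""
    return 1.0 - math.exp(-x * y)


def criterion(x0: float, y: float) -> float:
    """(dg/dx) / sqrt(g (1 - g)) evaluated at x = x0."""
    p = g(x0, y)
    dg_dx = y * math.exp(-x0 * y)
    return dg_dx / math.sqrt(p * (1.0 - p))


def argmax_golden(f, lo: float, hi: float, tol: float = 1e-9) -> float:
    """Golden-section search for the maximum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d   # the maximum lies in [a, d]
        else:
            a = c   # the maximum lies in [c, b]
    return (a + b) / 2.0


if __name__ == "__main__":
    x0, n = 2.0, 20                      # hypothetical boundary density and sample size
    y_best = argmax_golden(lambda y: criterion(x0, y), 1e-6, 10.0)
    k = math.floor(n * g(x0, y_best))    # integer part of n * g(x0, y_best)
    print(x0 * y_best, k)
```

Under this model the maximizing volume satisfies $x_0 y = 2(1 - e^{-x_0 y})$, so the product $x_0 y$ does not depend on $x_0$.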










15. The Application of Learning Curves to Industrial Planning. Preliminary 
Report. JAMES R. CRAWFORD, Lockheed Aircraft Corporation.


Learning curves are significant factors of analysis in industries producing quantities of 
less than 20,000 units of a given article. Ship-building and airframe manufacture are the 
two largest industries in this class. Learning curves occur where job costs are kept either 
by individual unit or by lot, and also where achievement is measured against a standard. 
Cost per unit plots against ordinal unit number as a straight line on logarithmic graph- 
paper. Learning curves are used to supplement time-studies, determine the capacity of 
tooling, layout of budgets, and for estimating and bidding. The experience of individual 
workers and management are reflected in these analyses. The slope of the learning curve 
is related to the amount to be learned. Plateaus occur which are related to the hiring of 
new workers and to the relaxing of control measures. Other consistent minor patterns 
occur which are related to specific conditions. Equations have been derived and tables 
computed for five related forms of the learning curve. Graphic methods are satisfactory 
except for bidding. This study covers a simple approach to an important problem of indus- 
trial management. The findings in the industrial field may benefit research in the field 
of the psychology of learning. 
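A straight line on logarithmic paper corresponds to a power law, cost of unit x equal to $a\,x^{b}$. The sketch below assumes the common parametrization in which each doubling of the unit number multiplies unit cost by a fixed rate r, so that $b = \log_2 r$; the abstract does not say which of its five related forms this is. It also fits a and b by least squares on the logarithms. Names and the sample figures are hypothetical.

```python
import math


def unit_cost(first_unit_cost: float, rate: float, unit: int) -> float:
    """Cost of the given ordinal unit when each doubling multiplies cost by `rate`."""
    slope = math.log2(rate)          # slope of the line on log-log paper
    return first_unit_cost * unit ** slope


def fit_loglog(units, costs):
    """Least-squares line through (log unit, log cost); returns (a, b) in cost = a * unit**b."""
    xs = [math.log(u) for u in units]
    ys = [math.log(c) for c in costs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return math.exp(my - slope * mx), slope


if __name__ == "__main__":
    units = [1, 2, 4, 8, 16]
    costs = [unit_cost(100.0, 0.8, u) for u in units]   # hypothetical 80 per cent curve
    a, b = fit_loglog(units, costs)
    print(costs)
    print(a, 2.0 ** b)
```

With real cost records the fitted points will scatter about the line, and plateaus of the kind the abstract mentions would appear as systematic departures from it.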


16. Relative Effects of Inbreeding and Selection in Poultry. W. O. WILSON,
University of California, Davis. 


Egg production rate, fertility, hatchability, and chick mortality records from the Iowa 
State College Poultry Department’s inbreeding project were studied. Statistics which 
were calculated from the data included simple and partial regression of traits on inbreeding, 
estimates of heritability by correlation between paternal half-sibs and by daughter-dam 
regressions, and selection differentials. The net genetic gain or loss in merit per generation 
was considered to be the sum of the product of selection differentials and heritability, plus 
the product of regression of trait on inbreeding and increase in amount of inbreeding. The 
amount of inbreeding that can be done in each of the traits was estimated when there was no
net loss or gain. Of the traits studied, the rank was in the following order: Hatchability, 
chick mortality, fertility, and egg production. 
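The accounting described in this abstract can be written out as a short calculation: net gain per generation equals selection differential times heritability, plus regression of the trait on inbreeding times the increase in inbreeding. The sketch treats the selection differential as a single combined figure, which is a simplifying assumption, and every number in it is hypothetical.

```python
def net_gain(selection_differential: float, heritability: float,
             regression_on_inbreeding: float, inbreeding_increase: float) -> float:
    """Net genetic gain (or loss) in merit per generation."""
    return (selection_differential * heritability
            + regression_on_inbreeding * inbreeding_increase)


def break_even_inbreeding(selection_differential: float, heritability: float,
                          regression_on_inbreeding: float) -> float:
    """Increase in inbreeding at which gain from selection is exactly cancelled."""
    return -selection_differential * heritability / regression_on_inbreeding


if __name__ == "__main__":
    # Hypothetical figures: 10 eggs selection differential, heritability 0.2,
    # and a loss of 0.5 egg per one per cent increase in inbreeding.
    print(break_even_inbreeding(10.0, 0.2, -0.5))
    print(net_gain(10.0, 0.2, -0.5, 4.0))
```

A trait with a larger break-even value tolerates more inbreeding per generation before selection gains are cancelled.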


17. The Rate of Genetic Gain in Egg Production in Progeny-Tested Flocks 
as a Function of the Interval between Generations. EVERETT R. DEMPSTER AND
I. MICHAEL LERNER, University of California, Berkeley.


The rate of genetic gain in a character for which selection is practiced depends in addition 
to the intensity of selection on (1) the accuracy of selection, and (2) the average interval 
between generations. These factors are not independent and exercise a pull in opposite 
directions. Through the application of Wright’s technique of path coefficients comparisons 
can be made between the expected rates of genetic gain in populations containing varying 
proportions of breeding animals of different ages. The methods used involve the estimation 
of correlations between genotypes, and various selection indexes based on individual, sib 
and progeny records in unculled populations as well as in populations whose range has been
restricted by previous selection. From these estimates the relative efficiencies of different 
age distribution schemes of a breeding population can be determined. A specific solution 
for such a situation in a flock bred for egg production will be presented as an illustration of
the problems and methods used in the study of the genetics of populations under artificial
selection.


18. Statistical Criteria of the Effectiveness of Selective Procedures. Prelim- 
inary Report. R. F. Jarrett, University of California, Berkeley. 







The “validity coefficient,” the standard error of estimate, the index of predictive effi- 
ciency, the “selection ratio” of Taylor and Russell, Johnson’s Gamma, and other statistical 
devices have been suggested as indices of the effectiveness of selective programs. These 
devices all suffer from the deficiency that they do not permit a satisfactorily precise estimate 
of the dollar value of the increased output expected from the selection program and thus 
leave unsettled the question as to whether or not the cost of such a program is justified. 
The relationship between the correlation coefficient on the one hand and the mean value of 
Y for an unselected population (Y being an objective output-type criterion), the standard 
deviation of Y for an unselected population, and the mean value of Y for the upper Np in- 
dividuals selected on the basis of their high performance on the selective test X on the other 
hand, provides the basis for estimating the increase in the mean output of a group of workers 
selected on the basis of a testing program yielding any specified validity coefficient with the 
criterion Y. Increase in productivity of selected workers is shown to be a function of the 
validity coefficient, the rigorousness of selection, and the coefficient of variability of the 
output criterion among “unselected” employees. 
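The dependence on validity coefficient, rigor of selection, and coefficient of variability can be illustrated under an assumption the abstract does not state: that test score X and output Y are jointly normal. In that case the mean output of the upper fraction p selected on X exceeds the unselected mean by $r\,\sigma_Y\,\phi(z_p)/p$, where $z_p$ is the normal cut-off leaving a fraction p above it. Function names are illustrative.

```python
from statistics import NormalDist


def standardized_gain(validity: float, selected_fraction: float) -> float:
    """Mean output of the selected group, in standard deviations above the unselected mean."""
    std_normal = NormalDist()
    cutoff = std_normal.inv_cdf(1.0 - selected_fraction)
    return validity * std_normal.pdf(cutoff) / selected_fraction


def proportional_gain(validity: float, selected_fraction: float,
                      coefficient_of_variation: float) -> float:
    """Expected increase in mean output as a fraction of the unselected mean."""
    return coefficient_of_variation * standardized_gain(validity, selected_fraction)


if __name__ == "__main__":
    # Hypothetical programme: validity 0.5, upper half selected, output CV of 20 per cent.
    print(proportional_gain(0.5, 0.5, 0.2))
```

Multiplying the proportional gain by the money value of mean output would give the kind of dollar estimate the abstract says the older indices cannot supply.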


19. Approaches to Univocal Factor Scores. Preliminary Report. J. P. 
GUILFORD, University of Southern California.


In spite of the fact that univocal factor scores are badly needed for various reasons, it 
appears to be impossible by present methods to construct pure tests for some common 
factors. Recourse must therefore be made to statistical control of component variances. 
It is desirable to derive each factor score from a minimum number of tests. The availability 
of a few univocal tests makes this requirement fairly easy to satisfy. Such tests serve well 
as suppression variables for their common-factor variances where not wanted in other tests. 
Several principles may be invoked as objectives: (1) to maximize the desired variance in 
the impure test, (2) to reduce the undesired variance to zero, or (3) to minimize the undesired 
variance without intolerable loss of the desired variance. A secondary objective is to 
assure a combining weight of +1.00 for the test measuring the desired factor. Equations 
for achieving the objectives have been derived and the limitations and implications of each 
procedure have been noted. By means of statistical control, the situation seems hopeful 
for the achievement of univocal scores for a fairly large number of unique psychological
variables. There are implications for experimental psychology as well as for vocational
testing.


20. A Note on the Problem of Binary Stars. ELIZABETH L. SCOTT, University
of California, Berkeley.


This paper concerns some of the problems of Trumpler (see next abstract). &; is the 
radial velocity of the 7-th star, i=1, 2,---, s, at ¢; selected at random, j = 1, 2,---,n. 24, 
measurement of éj;, is N (ij, oi). i is random with distribution c (ki — (&i; — £io)*) 4 
where kj = O and i are unknown. (1) Test of hypothesis that kj = O. Case (i) oj known. 
Whatever the exact test 7’, its power Br(k) has derivative 67(0) = 0. Test maximizing 

n 


87(0) is that of Trumpler with criterion S? = > (xij — 24)? > x20;. Case (ii) oj un- 
j=l 
known. W hatever the exact test 7, 8°” (0) = 0, m = 1, 2,3. Test maximizing 8{*(0) is 


Trumpler’s test > (ry — z¢)4 > [> (ry — zo] C. (2) Let 7 (ij — io)? = 2A; 07. 


j=l j=1 


j=l 

For constant velocity stars \=0. For others it is a random variable. Since, given \ = 0, 
S? is distributed as non-central x?, an integral equation connects the distributions of S? 
and X. Its solution yields an estimate of the proportion of constant velocity stars. After 
estimating the distribution of A, the level of significance can be estimated and also the 







number n of measurements so that the proportion of constant velocity stars declared variable
will be less than p, specified in advance.


21. Statistical Problems of Spectroscopic Binaries. ROBERT J. TRUMPLER,
University of California, Berkeley. 


Spectroscopic Binaries are stars whose radial velocities, as measured by the Doppler 
shift of spectral lines, show a periodic variation. The first problem is to obtain a statistical 
criterion for deciding whether a star with several radial velocity measures, made at different 
times, has a high probability (larger than a specified limit) of variable velocity and should 
be announced as an object worthy of further study. The second problem is to find the 
percentage of variable velocity stars among a large list of stars with several radial velocity 
measures for each star. From the distribution of standard errors only the percentage of 
cases where the velocity variation exceeds a certain limit can be ascertained. The third 
problem is concerned with those stars for which a binary orbit has been determined. The 
statistical distribution of these binary systems according to mean distance between the two 
stars and the ratio of their masses can be evaluated within certain limits. 





NEWS AND NOTICES 


Readers are invited to submit to the Secretary of the Institute news items of interest.


Personal Items 


Mr. Kenneth J. Arrow has been appointed Research Associate of the Cowles 
Commission. 

Dr. W. D. Baten, formerly of Michigan State College, is now Chief, Opera- 
tions Branch, Planning Section, Air Defense Command, Mitchel Field, New 
York. 

Dr. Paul T. Bruyere is now Chief of the Medical Records and Statistics Branch, 
Army Institute of Pathology, Office of the Surgeon General, War Department. 

Dr. A. C. Cohen received his discharge from the Army, with the rank of Lieu- 
tenant Colonel, at the beginning of the spring quarter, and returned to his former 
position at Michigan State College. He has accepted a position at the Uni- 
versity of Georgia beginning with the 1947 summer session there. 

Dr. Hallett H. Germond has resigned from his position as professor of mathe- 
matics at the University of Florida. He is now Director of Research for the 
S. W. Marshall firm of Consulting Engineers, in New York City. 

Dr. Meyer A. Girshick, formerly with the Department of Agriculture, is now 
with the Douglas Aircraft Company in Santa Monica, California. 

Dr. Clyde H. Graves has accepted a position as Operations Analyst, Opera- 
tions Analysis, Air Defense Command, Mitchel Field, New York.

Dr. E. J. Gumbel has been appointed to an Associate Professorship at Brook- 
lyn College. 

Dr. Trygve Haavelmo has returned to Norway, and is at the University 
Institute of Economics, Oslo. 

Mr. Joseph O. Harrison, Jr., is now employed as a Mathematician in the 
Computing Branch of the Ballistic Research Laboratories, Aberdeen Proving 
Ground. 

Dr. Wassily Hoeffding has accepted a position as Research Associate, The
Institute of Statistics, University of North Carolina, Chapel Hill. 

Mr. Cyrus A. Martin is now an administrative analyst and statistician, as- 
sisting Chief of Personnel Control of Signal Corps, in Washington, D. C. 

Mr. Jack I. Northam has accepted an Assistant Professorship in the Depart- 
ment of Mathematics, Kansas State College, Manhattan, beginning with the 
1947 summer session. 

Professor Henry Scheffé, who has been on leave for the past year, returned to 
his position in the Engineering Department, University of California at Los 
Angeles, in June. 

Mr. Edward M. Schrock has accepted a position as Quality Control Engineer 
with the General Electric Company at their Erie Works, Erie, Pa. 

Mr. Jerome R. Steen, who has been manager of Quality Control Engineering
with the Sylvania Electric Products in Emporium, Pa., has now transferred with
the same company to Flushing, New York.


Professor Emeritus Irving Fisher, of Yale University, died April 29, 1947, 
at the age of eighty. 







In connection with the Atlantic City meeting of the American Chemical 
Society, April 14-18, 1947, a symposium on Statistical Methods in Experimental 
and Industrial Chemistry was held, in which several members of the Institute 
of Mathematical Statistics took part. The following program was presented 
Tuesday morning and afternoon, April 15: 

(1) Introductory Remarks. B. L. Clarke. 

(2) The Management Viewpoint. George Smith. 

(3) A New Technique for Testing the Accuracy of Analytical Data. W. J. Youden.

Discussion: Grant Wernimont, R. F. Moran, John Mandel, and Roland
H. Noel.


(4) Design of Experiments in Industrial Research. Hugh M. Smallwood. 
(5) Statistical Training for Industry. Samuel S. Wilks.

Discussion: John Tukey, E. V. Lewis, Churchill Eisenhart, and C. West 
Churchman. 


Preliminary Actuarial Examinations 


Prize Awards 


The winners of the prize awards offered by the Actuarial Society of America 
and the American Institute of Actuaries to the nine undergraduates ranking 
highest in the combined score on Part 1 and Part 2 of the 1947 Preliminary 
Actuarial Examinations are as follows: 


First Prize of $200

[name illegible], University of Toronto

Additional Prizes of $100

[name illegible], Yale University
[name illegible], Rutgers University
[name illegible], Harvard University
[name illegible], University of Manitoba
[name illegible], Rutgers University
[name illegible], University of Buffalo
[name illegible], Brown University
[name illegible], University of Toronto


The two actuarial organizations have authorized a similar set of nine prize 
awards for the 1948 Examinations. 





























The Preliminary Actuarial Examinations consist of the following three examinations:

Part 1. Language Aptitude Examination
(Reading comprehension, meaning of words and word relationships, antonyms,
and verbal reasoning).

Part 2. General Mathematics Examination
(Algebra, trigonometry, coordinate geometry, differential and integral
calculus).

Part 3. Special Mathematics Examination
(Finite differences, probability and statistics).

The 1948 Examinations will be administered by the College Entrance Examination
Board at centers throughout the United States and Canada on May 14-15,
1948.




Correction 


In the Directory of Members published in Vol. XVII, No. 4 (December 1946) 
Professor Joseph Kampé de Feriet’s name is listed in the F’s under Feriet. It 
should have appeared in the K’s, under Kampé de Feriet. 




New Members 


The following persons have been elected to membership in the Institute (March 1 to May 30,
1947):

Adams, Joe K. Ph. M. (Wisconsin) Graduate student and half-time instructor in Psy- 
chology, Graduate College, Princeton University, Princeton, N. J. 

Adams, Walter B. Communications Analyst, Civil Aeronautics Admin., Dept. of Commerce,
8253 S. Ingleside Ave., Chicago 19, Ill.

Aitken, Alexander C. D.Sc. (Edinburgh) Professor of Mathematics, University of Edin- 
burgh, 23 Stirling Road, Edinburgh 5, Scotland 

Brambilla, Francesco Ph.D. (Univ. L. Bocconi) Lecturer in Math. Statistics, Institute 
of Statistics, Universita L. Bocconi, 6 via Panzacchi, Milano, Italy 

Brown, George Middleton, D.Sc. (Michigan) Asst. Prof. of Math., Mich. State College, East 
Lansing, Mich., 633 Cherry Lane 

Bueno, Luiz de Freitas, E.E. (Mackenzie Coll.) Professor da Universidade de Sao Paulo, 
Brazil, Rua Itambé 341, Casa 13 

Burke, Cletus J., M.A. (U.C.L.A.) Res. Ass’t, Univ. of Iowa, Iowa City, Iowa, 118 River- 
side Park 

Cameron, Joseph M., M.S. (N. Car. State) Room 302 South Building, National Bureau of 
Standards, Washington, D.C. 

Carpenter, Osmer M.S. (Iowa State) Instructor, Mathematics Department, Iowa State 
College, Ames, Iowa 

Castellani, Maria D.Sc. (Rome) Visiting Professor, Department of Mathematics, Uni- 
versity of Kansas City, Kansas City 4, Mo 

Chernoff, Herman Sc.M. (Brown) National Research Council Pre-Doctoral Fellow, 3003 
Wallace Ave., Bronx 67, N. Y. 

Clark, Stanley M.Ed. (Saskatchewan) Student and teaching assistant, 1301-7th St., S.E., 
Minneapolis 14, Minn. 







Cover, John H. Ph.D. (Columbia) Director, Bureau of Business and Economic Res., 
Univ. of Maryland, College Park, Md. 

Dailey, John T. M.S. (N. Texas Teachers Coll.) Res. Psychologist (Aviation), Psycho- 
logical Res. and Examining Unit, Sqn. E, Indoctrination Div., Air Training Command, 
San Antonio, Texas 

Darling, Donald A. Ph.D. (Calif. Inst. Tech.) Teaching Ass’t, Calif. Inst. of Technology, 
Pasadena 4, Calif. (As of July 1947, Dept. of Math., Cornell Univ., Ithaca, N. Y.)

Darmois, Georges D.Sc. (Paris) Prof. à la Faculté des Sciences de Paris, 7 Rue de l’Odéon,
Paris 6, France 

Davies, J. Alfred M.A. (Alabama) Statistician, Design Eng. Section, General Electric 
Co., 708 Hill Avenue, Owensboro, Kentucky

Dunnett, Charles W. M.A. (Toronto) Student, 1044 John Jay Hall, Columbia Univ., 
New York 27, N. Y. 

Egermayer, Frantisek Sc.D. (Charles Univ., Prague) Chief of Section, State Statistical 
Office, 2 Bélského, Prague VII, Czechoslovakia. 

Fickenscher, Edgar H. A.B. (Calif.) Graduate student and teaching ass’t, Univ. of Calif., 
1490 Acton St., Berkeley 2, Calif. 

Fraga, Constantino G. Jr. (Sao Paulo) Head, Dept. of Statistics, Instituto Agronomico, 
Campinas (S.P.), Brazil. 

Frank, Elmore J. B.A. (Chicago) Instr. in Statistics, Ill. Institute of Tech., and Statisti- 
cian, Commercial Res. Dept., Armour and Co., 5423 Maryland Ave., Chicago 15, Ill. 

Frisch, Ragnar Ph.D. (Oslo) Professor, University Institute of Economics, Oslo, Norway. 

Geary, Robert C. D.Sc. Superintending Officer, Statistics Branch, Dept. of Industry and
Commerce, 27 Leeson Park, Dublin, Ireland.

Goodman, John R. M.S. (Iowa State) Head, Sampling Section, Survey Res. Center, 
Univ. of Mich., Ann Arbor, Mich. 

Gutman, Pierre M.A. (Columbia) Student, 7 Mountain Ave., Maplewood, N. J. 

Hartline, H.K. M.D. (Johns Hopkins) Assoc. Prof. of Biophysics, Johnson Res. Founda- 
tion, Univ. of Pennsylvania, 36th and Spruce Sts., Philadelphia, Pa. 

Hartog, Jacob A. (Rotterdam) Rockefeller Fellow, 25 Follen St., Cambridge, Mass. 

Jacobs, Marcus A.B. (Penn.) Health Statistician, 4439 S. 36th St., Arlington, Va.

Jeeves, Terry A. A.B. (Calif.) Teaching ass’t in math., Univ. of Calif., 2511 Hearst Ave., 
Berkeley 9, Calif. 

Kempthorne, Oscar M.A. (Cambridge, England) Res. Assoc. Prof., Statistical Lab., Iowa 
State College, Ames, Iowa 

Kendall, David G. M.A. (Oxford) Fellow, Magdalen Coll., Oxford, England 

Kent, Leonard M.B.A. (Chicago) Instr. in Statistics, School of Business, Univ. of Chic- 
ago, Chicago 37, Ill. 

Kupperman, Morton B.S. (C.C.N.Y.) Statistician, Office of the Surgeon General, War 
Dept., 2829-27th St., N.W., Washington 8, D.C. 

Lahti, Elizabeth L. M.A. (Michigan) Statistician, Bur. of Measurement and Guidance,
Carnegie Institute of Technology, Pittsburgh 13, Pa. 

Levine, Harry D. B.S. (Chicago) Instr., Long Island Univ., 164 W. 96 St., New York 26,
N. Y.

Lichtenstein, Morris B.A. (Michigan) Statistician, 4811 N. Capitol St., N.E., Washington 
11, D.C. 

McMillan, Brockway Ph.D. (Mass. Inst. Tech.) Member, Technical Staff, Bell Telephone
Labs., Murray Hill, N. J.

Marshall, Andrew W. Student, 5757 University Ave., Chicago 37, Ill.

Metzner, Charles A. Ph.D. (Wisconsin) Study Director, Survey Research Center, Univ. 
of Michigan, Ann Arbor, Mich. 

Norton, John W. B.S. (California) Lab. Supervisor, Union Oil Co. of Calif., 5529 Mac- 
donald Ave., Richmond, Calif. 







Otter, Richard Ph.D. (Indiana) Instructor, Fine Hall, Princeton Univ., Princeton, N. J. 

Passos, Helena Rocha Penteado Diretor de Divisão do Depto. Estadual de Estatistica de
Sao Paulo, Avenida Angélica, 160, Apto. 6, São Paulo, Brazil

Priest, Edward I. B.S. (Columbia) Student in mathematics, 1204 E. 55th St., Chicago, Ill.

Quensel, Carl-Erik Fil.Dr. (Lund) Prof. at the University, Lund, Sweden, Linnegatan 
14 

Rankin, Mozelle M.A. (Ohio State) Ass’t Instructor, Ohio State Univ., 107-14th Ave., 
Columbus 1, Ohio 

Robb, Richard A. D.Sc. (Glasgow) Mathematics Lecturer and Mitchell Lecturer in 
Statistics, Univ. of Glasgow, Glasgow, W. 2, Scotland.

Ruist, Erik Fil.kand. (Stockholm) Amanuens, Industriens utredningsinstitut, Stock- 
holm 16, Sweden 

Sahni, Inder M. M.A. (Punjab) Rothamsted Experimental Station, Harpenden, Herts,
England. 

Schneider, B. Aubrey Sc.D. (Johns Hopkins) Ass’t Director, Dept. of Statistics and 
Special Services, American Cancer Society, 47 Beaver St., New York 4, N. Y. 

Seitz, Jiri Ph.D. (Prague) Koutimské 8, CSR, Praha XII, Czechoslovakia.

Simaika, Jacques B. Ph.D. (London) Lecturer, Faculty of Science, Fuad I University, 
Abbassia, Cairo, Egypt.

Slatin, Benjamin M.A. (Columbia) Jr. Analyst, Econometric Institute, 179 Peshine
Ave., Newark 8, N. J. 

Suydam, Bergen R. A.B. (N.Y. State Coll. for Teachers) Graduate student, Columbia
University, 1 W. 706 St., Shanks Village, Orangeburg, N. Y.

Tashmuhamed, Sarymsakov Ph.D. (Moscow) President of the Academy of Sciences of
Uzb.SSR, Professor of the University, Tashkend, ul. Abdulli Tukaeva I, Tashkent,
USSR

Travers, Robert M. W. Ph.D. (Columbia) Examiner, and Assoc. Prof. of Education,
Bureau of Psychological Services, Univ. of Mich., Ann Arbor, Mich.


Weiner, Sidney B.S. (C.C.N.Y.) Student, New York University Graduate School, 1639 
East 17th St., Brooklyn 80, N.Y. 

Wezelman, Sol M. A.B. (Michigan) Graduate student, University of Michigan, Ann 
Arbor, Mich., 2432 Burt St., Omaha, Nebr. 

Wishart, John D.Sc. (London) Reader in Statistics, School of Agriculture, Cambridge, 
England 





REPORT ON THE NEW YORK MEETING OF THE INSTITUTE 


The Twenty-Sixth Meeting of the Institute of Mathematical Statistics was 
held in New York City on Thursday, April 24, and Friday, April 25, 1947, and 
was co-sponsored by the American Mathematical Society. This meeting was 
devoted to a program on Stochastic Processes and Noise. The attendance of 
190 persons included the following 75 members of the Institute: 


F. S. Acton, C. B. Allendoerfer, F. L. Alt, T. W. Anderson, Jr., L. A. Aroian, W. D. Baten,
Robert Bechhofer, J. H. Bigelow, D. H. Blackwell, Paul Boschan, G. W. Brown, R. S. Burington,
B. H. Camp, E. W. Cannon, A. G. Carlton, K. L. Chung, P. C. Clifford, D. D. Cody,
Harald Cramér, H. B. Curry, J. H. Curtiss, R. L. Dietzold, J. L. Doob, Jacques Dutka,
Churchill Eisenhart, Benjamin Epstein, Will Feller, M. M. Flood, Bernard Friedman,
C. P. Gerschenson, H. H. Goode, C. H. Graves, E. J. Gumbel, T. E. Harris, Millard Hastay,
L. H. Herbach, P. G. Hoel, Mark Kac, R. D. Keeney, T. C. Koopmans, William Kruskal,
Jack Laderman, J. E. Lieberman, S. B. Littauer, Melitta Lowy, P. J. McCarthy,
Brockway McMillan, Frederick Mosteller, L. F. Nanni, P. M. Neurath, G. E. Noether,
M. L. Norden, C. O. Oakley, P. S. Olmstead, G. B. Price, J. S. Rhodes, John Riordan,
Selby Robinson, Frank Saidel, Arthur Sard, F. E. Satterthwaite, G. R. Seth, C. E. Shannon,
Jack Sherman, W. A. Shewhart, Rosedith Sitgreaves, Andrew Sobczyk, Milton Sobel,
Emma Spaney, C. M. Stein, J. W. Tukey, D. F. Votaw, Jr., B. T. Weber, S. S. Wilks,
Jacob Wolfowitz.


The first session was held on Thursday morning, with Professor Carl Allendoerfer
of Haverford College serving as chairman. The following program
was presented:

Stochastic Processes— 


Description, Professor J. L. Doob, Columbia University 
Estimation, Professor Will Feller, Cornell University 
Prediction, Professor N. Wiener, Massachusetts Institute of Technology 


This meeting was concluded with a discussion by Dr. H. W. Bode, Bell Telephone 
Laboratories, Professor Mark Kac, Cornell University, and Professor A. Wald, 
Columbia University. 

Dr. S. O. Rice, Bell Telephone Laboratories, was chairman of the Thursday 
afternoon session. The following program was presented: 


Stochastic Processes in Some Applications— 


In Economics, Dr. T. Koopmans, Cowles Commission 

In Insurance, Professor H. Cramér, Yale University 

In Cosmic Radiation, Professor N. Arley, Princeton University 
In Nuclear Physics, Dr. S. M. Ulam, Los Alamos Laboratory


The final session was held on Friday morning with Professor J. W. Tukey 
of Princeton University as chairman. The program was as follows: 


Different Ways of Describing Noise— 


By a Noise Spectrum, Dr. C. E. Shannon, Bell Telephone Laboratories 
By a Single Function, Mr. J. H. Bigelow, Institute for Advanced Study
By Many Functions, Professor Mark Kac, Cornell University 
Round Table on Interrelations, Messrs. Shannon, Bigelow, Kac, and Rice
P. S. DWYER,
Secretary. 







REPORT ON THE APRIL MEETING OF THE INSTITUTE IN 
ATLANTIC CITY 


The Twenty-Seventh Meeting of the Institute of Mathematical Statistics was 
held in cooperation with the Eastern Psychological Association, on Saturday 
morning, April 26, 1947, in Atlantic City. This meeting was a Round Table 
on Certain Recent Statistical Developments, and its attendance of approximately 
100 persons included the following 9 members of the Institute: 


F. S. Acton, J. W. Dunlap, Benjamin Epstein, Irving Lorge, P. J. McCarthy, Frederick
Mosteller, P. J. Rulon, F. E. Satterthwaite, and Emma Spaney. 


Professor Bernard Riess of Hunter College was chairman of the meeting. 
The following program was presented: 


Papers: Sequential Analysis. 
Dr. Irving Lorge, Teachers College, Columbia University 
Staircase Methods. 
Dr. Philip J. McCarthy, Cornell University 
Inefficient Statistics. 
Dr. Frederick Mosteller, Harvard University 


Discussion: Dr. Jack W. Dunlap, Psychological Corporation 
Dr. Leon Festinger, Massachusetts Institute of Technology 
Dr. William E. Kappauf, Princeton University 
Dr. Joseph Zubin, New York Psychiatric Institute Hospital 
P. S. DWYER,
Secretary. 













REPORT ON THE SAN DIEGO MEETING OF THE INSTITUTE 


The first Western Regional meeting of the Institute of Mathematical Statistics 
was held in San Diego, California, June 17-19, 1947, jointly with the American 
Association for the Advancement of Science. The meeting was attended by 53 
persons, including the following 31 members of the Institute: 


G. A. Baker, Joseph Berkson, Z. W. Birnbaum, H. C. Carver, Harald Cramér, J. R. 
Crawford, Dorothy Cruden, W. J. Dixon, Robert Dorfman, M. W. Eudey, Evelyn Fix, 
John Gurland, J. L. Hodges, Jr., J. M. Howell, H. M. Hughes, E. S. Keeping, E. L. Lehmann,
R. H. Lien, F. J. Massey, G. F. McEwen, Frederick Mosteller, S. W. Nash, Jerzy Neyman,
Kathryn B. Rolfe, Henry Scheffé, Herbert Solomon, C. M. Stein, Zenon Szatrowski, H. M. 
Walker, J. D. Williams, Zivia S. Wurtele. 


The afternoon session on June 17 was a joint meeting with the Group of 
Former Operations Analysts. The following program was presented under the 
chairmanship of Col. Roscoe C. Wilson:


Topic: Statistical Problems in Operations Analysis. 
Papers: Engineering and Statistics at the Pacific Front in World War II. 
Roger Wilkinson, Bell Telephone Laboratories, New York City. 
Present Organization and Activities of Operations Analysis. 
Leroy A. Brothers, Operations Analysis, Asst. Chief of Air Staff-3, Washington, 
D.C. 
Statistical Evidence of Bomb Release Malfunctions. 
Mark W. Eudey, University of California, Berkeley. 
Study of Effectiveness of Certain Bombs Used Against German Industrial Targets.
J. Neyman, University of California, Berkeley. 


The morning session on June 18 was presented with Professor Alva R. Davis 
as chairman, and the program was as follows: 


Topic: Statistical Problems in Biology. 
Papers: A Mathematical Model of the Relation between White and Yolk Weights of Birds’ 
Eggs. 
G. A. Baker, University of California, Davis. 
Statistical Analysis for a New Procedure in Sensitivity Experiments. 
W. J. Dixon, University of Oregon, and A. M. Mood, Iowa State College. 
The Relation of Inbreeding to Calf Mortality. 
P. W. Gregory, University of California, Davis. 
Cooperative Field Trials. 
P. A. Minges, University of California, Davis. 
Population Genetics. 
N. H. Horowitz, California Institute of Technology. 
Statistical Problems in Assessing Methods of Medical Diagnosis, with Particular 
Reference to X-Ray Technique. 
J. Yerushalmy, United States Public Health Service, Washington, D. C. 
Discussion: J. Neyman, University of California, Berkeley. 


Professor John W. Miles was chairman of the afternoon session on June 18, 
which was a joint session with the California Section of the American Society 
for Quality Control. The following papers were presented: 

Topic: Industrial Applications of Statistics.
Papers: Operating Characteristics of Average and Range Charts.
Henry Scheffé, University of California, Los Angeles.
Sampling Inspection by Variables.
Herbert Solomon, Stanford University.
Some Exact Numerical Results for Sequential Acceptance Sampling by Attributes.
Mark W. Eudey, University of California, Berkeley.
Choice of Inspection Stringence in Acceptance Sampling by Attributes.
Joseph L. Hodges, University of California, Berkeley.
Widening Tolerances for Closer Fitting Parts.
Edmond E. Bates, Quality Engineering Consultants, Los Angeles.
Discussion: Russell O’Neill, University of California, Los Angeles.
Re-establishing Operator Responsibility for Quality Control.
Wyatt H. Lewis, General Electric Company, Ontario, California.
Discussion: William B. Rice, Plomb Tool Company, Los Angeles.
The Application of Learning Curves to Industrial Planning.
James R. Crawford, Wright Field, Dayton, Ohio.


The Wednesday evening session was under the chairmanship of Professor 
George Beadle, California Institute of Technology, with the following program: 


Topic: Statistical Problems in Genetical Studies in Chickens.
Paper: Rate of Genetic Gain in Egg Production in Progeny-tested Flocks as a Function of
the Interval between Generations.
Everett R. Dempster and I. Michael Lerner, University of California, Berkeley.


On Thursday morning, June 19, there was a joint session with the Western
Psychological Association. Professor Helen Walker of Columbia University was
chairman. The program was as follows:


Topic: Statistical Problems in Psychology.
Papers: Statistical Criteria of the Effectiveness of Selective Procedures.
R. F. Jarrett, University of California, Berkeley.
Unsolved Statistical Problems Arising in Psychological Measurements.
Helen Walker, Columbia University.
Cost Utility Curves as a Means of Assessing Batteries of Tests.
Joseph Berkson, Mayo Clinic.
Approaches to Univocal Factor Scores.
J. P. Guilford, University of Southern California.


The afternoon session on June 19 was under the chairmanship of Professor 
Harald Cramér of Stockholm, Sweden, and offered the following program: 


Topic: Theory of Statistics and its Applications to Astronomy.
Papers: Random Variables with Comparable Peakedness.
Z. W. Birnbaum, University of Washington.
Distributions which Lead to Regressions Representable by Polynomials.
Evelyn Fix, University of California, Berkeley.
Optimum Tests of Composite Hypotheses with One Constraint.
Erich L. Lehmann, University of California, Berkeley.
Estimation of a Distribution Function by Confidence Limits.
Frank J. Massey, Jr., University of California, Berkeley.
A Note on Sequential Confidence Sets.
Charles Stein, Columbia University.




Certain Types of Statistical Problems in Astronomy.
Robert J. Trumpler, University of California, Berkeley.
Basic Concepts of the Theory of Statistics in Relation to Certain Problems of
Astronomy.
J. Neyman, University of California, Berkeley.
A Note on the Problem of Binary Stars.
Elizabeth L. Scott, University of California, Berkeley.
Explicit Solution of the Problem of Fitting a Straight Line when both Variables
are Subject to Error for the Case of Unequal Weights. (By title)
Elizabeth L. Scott, University of California, Berkeley.
Power Function of the Analysis of Variance and Covariance Test of a Normal
Bivariate Population. (By title)
Way Ming Chen, University of California, Berkeley.
Unbiased Estimates with Minimum Variance. (By title)
Charles Stein, Columbia University.
Sufficient Statistics and a System of Partial Differential Equations. (By title)
Erich L. Lehmann, University of California, Berkeley, and Henry Scheffé,
University of California, Los Angeles.


On Wednesday evening, June 18, at 6 o’clock, there was a dinner for members 
and guests, at the Hotel San Diego. 


P. S. DWYER
Secretary 








